Stanford CS329H: Machine Learning from Human Preferences | Autumn 2024 | Voting
Polus: An Open Source Platform for Meaning-Making at Scale - Detailed Summary
Key Concepts:
- Polus: An open-source platform and methodology for gathering and making sense of diverse perspectives at scale. It involves submitting statements, voting (agree, disagree, pass), and using PCA and K-means clustering to identify groups and consensus.
- Sparse Matrix: The core data structure in Polus, representing user voting records on statements. Rows are participants, columns are statements, and each value records an agree, disagree, or pass vote. The matrix is sparse because each participant sees and votes on only some of the statements.
- PCA (Principal Component Analysis) & K-means Clustering: Statistical methods used to reduce the dimensionality of the sparse matrix and identify clusters of users with similar voting patterns.
- Community Notes (Twitter): A misinformation response system on Twitter that utilizes a methodology inspired by Polus to identify helpful notes based on diverse perspectives.
- LLMs (Large Language Models): Explored for their potential to enhance Polus through tasks like moderation, summarization, and vote prediction, but also recognized for their associated risks.
- Emergent Survey: The concept of Polus as a survey created by the people taking it, where statements are submitted by participants and voted on by others.
- Synthetic Experts: A potential threat where LLMs are used to create fake experts who promote a particular agenda, making it difficult to distinguish real expertise from misinformation.
- Policy Innovation Units: Government entities that experiment with new methods, like Polus, to gather public opinion for policy-making.
- ZK (Zero-Knowledge Proofs): A cryptographic technique that could be used to verify user identities anonymously, potentially mitigating the risk of adversarial attacks on Polus.
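The sparse vote matrix described in the key concepts can be sketched as a small Python class. This is a minimal illustration, not the platform's actual storage format: the numeric encoding (+1, -1, 0), the `VoteMatrix` class, and all method names are assumptions made for the example.

```python
from typing import Dict, List, Optional

# Assumed encoding (the talk does not specify one): agree = +1, disagree = -1, pass = 0.
AGREE, DISAGREE, PASS = 1, -1, 0


class VoteMatrix:
    """Participant-by-statement vote matrix; cells nobody has filled are simply absent."""

    def __init__(self) -> None:
        self.statements: List[str] = []
        self.votes: Dict[str, Dict[str, int]] = {}  # participant -> {statement: vote}

    def add_statement(self, statement_id: str) -> None:
        """Register a newly submitted statement (a new column)."""
        if statement_id not in self.statements:
            self.statements.append(statement_id)

    def vote(self, participant: str, statement_id: str, value: int) -> None:
        """Record one vote; a repeat vote on the same statement overwrites the old one."""
        if value not in (AGREE, DISAGREE, PASS):
            raise ValueError("vote must be AGREE, DISAGREE, or PASS")
        self.add_statement(statement_id)
        self.votes.setdefault(participant, {})[statement_id] = value

    def row(self, participant: str) -> List[Optional[int]]:
        """Dense view of one participant's record; None marks a statement not yet seen."""
        record = self.votes.get(participant, {})
        return [record.get(s) for s in self.statements]

    def density(self) -> float:
        """Fraction of cells that hold a vote."""
        cells = len(self.votes) * len(self.statements)
        filled = sum(len(record) for record in self.votes.values())
        return filled / cells if cells else 0.0


matrix = VoteMatrix()
matrix.vote("alice", "s1", AGREE)
matrix.vote("alice", "s2", PASS)
matrix.vote("bob", "s2", DISAGREE)
matrix.add_statement("s3")  # submitted, but nobody has voted on it yet

print(matrix.row("bob"))   # [None, -1, None]
print(matrix.density())    # 0.5
```

Keeping unseen cells distinct from a pass (None versus 0) matters here, since a pass is itself a recorded response while an unseen statement carries no information.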
1. Introduction to Polus and Colin Megill
- Colin Megill introduces himself as the creator of Polus, an open-source platform he has been working on for 13 years.
- He outlines the talk's structure: his background, the project's history, and recent developments like the grant program, OpenAI, and Community Notes on Twitter.
- Megill's background is in international relations and political science, followed by experience in startups.
- Polus began as a for-profit pro-social startup (2012-2016) before transitioning to open-source (2016) and then a nonprofit (2019).
- He offers to discuss the nonprofit technology model, citing examples like Wikipedia, Khan Academy, and Signal.
- Polus is deployed by governments (UK, Finland, Singapore, Taiwan, Amsterdam) and by hobbyists.
2. Core Functionality and Methodology
- Polus allows users to submit statements, which are then shown randomly to other users.
- Users can agree, disagree, or pass on these statements, creating a sparse matrix of voting data.
- This process is described as an "emergent survey," where the survey is created by the participants.
- The goal is to make meaning of perspectives at scale, inspired by the challenges of Occupy Wall Street and the Arab Spring.
- The core algorithm involves PCA and K-means clustering on the sparse matrix to identify groups with similar viewpoints.
- The system is designed to be "set and forget," meaning it can be used by facilitators without requiring constant monitoring of the underlying algorithms.
- Each statement is treated as a feature, and each participant's voting record is a row in the matrix.
- The consensus heuristic identifies statements that win agreement across groups of people who otherwise vote differently.
- This metric formed the basis for conversations with Twitter about Community Notes.
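The pipeline in this section (vote matrix, then PCA, then K-means, then a cross-group consensus score) can be sketched in plain Python. This is a sketch under stated assumptions, not the platform's implementation: the talk does not say how unseen votes are handled, so mean imputation is an assumption; the power-iteration PCA and farthest-first K-means initialization are simplifications chosen to keep the example dependency-free and deterministic; and scoring a statement by its lowest per-group agree rate is one hypothetical way to express "agreement among people who think differently".

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def impute_and_center(rows):
    """Fill unseen cells (None) with the statement's mean vote, then center each column."""
    centered = [list(r) for r in rows]
    for j in range(len(rows[0])):
        seen = [r[j] for r in rows if r[j] is not None]
        mean = sum(seen) / len(seen) if seen else 0.0
        for r in centered:
            r[j] = (mean if r[j] is None else r[j]) - mean
    return centered


def principal_components(x, k=2, iters=200, seed=0):
    """Top-k principal directions by power iteration with deflation."""
    rng = random.Random(seed)
    comps = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in x[0]]
        for _ in range(iters):
            for c in comps:  # stay orthogonal to directions already found
                d = dot(v, c)
                v = [a - d * b for a, b in zip(v, c)]
            scores = [dot(row, v) for row in x]          # X v
            w = [dot(scores, col) for col in zip(*x)]    # X^T (X v)
            norm = math.sqrt(dot(w, w)) or 1.0
            v = [a / norm for a in w]
        comps.append(v)
    return comps


def kmeans(points, k=2, iters=50):
    """Plain k-means; farthest-first initialization keeps the sketch deterministic."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cross_group_agreement(rows, labels, k=2):
    """Per statement: the lowest agree rate found in any group (assumed scoring rule)."""
    scores = []
    for j in range(len(rows[0])):
        rates = []
        for g in range(k):
            seen = [r[j] for r, l in zip(rows, labels) if l == g and r[j] is not None]
            rates.append(seen.count(1) / len(seen) if seen else 0.0)
        scores.append(min(rates))
    return scores


# Toy data: rows are participants, columns are statements; None = not seen.
votes = [
    [1, -1, 1, None],
    [1, -1, 1, 1],
    [1, -1, None, -1],
    [-1, 1, 1, None],
    [-1, 1, 1, 1],
    [-1, 1, 1, -1],
]
x = impute_and_center(votes)
comps = principal_components(x, k=2)
points = [[dot(row, c) for c in comps] for row in x]
labels = kmeans(points, k=2)
scores = cross_group_agreement(votes, labels, k=2)
print("cluster labels:", labels)
print("cross-group agreement per statement:", scores)
```

On this toy data the two voting blocs land in separate clusters, and the third statement, which both blocs agree with, gets the highest cross-group score. Note that the clustering uses only the votes, never the statement text.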
3. Robustness and Adversarial Attacks
- The discussion addresses the robustness of Polus against adversarial attacks, particularly attempts to game the system.
- An example is given of Uber's attempt to influence a Polus conversation in Taiwan by sending their drivers to participate.
- The Taiwanese government moderated the statements, limiting the surface area for gaming.
- The PCA and K-means algorithms are vulnerable to manipulation through coordinated voting patterns.
- The speaker acknowledges that it's now easier to build bots that can vote consistently, potentially creating artificial clusters.
- The ultimate defense against such attacks is seen as improving the sampling process and verifying user identities.
- ZK-based anonymous but verified identity solutions are mentioned as a promising area.
- An open area of research is identifying synthetic groups by analyzing the variance within clusters.
- The speaker suggests that LLMs might vote differently than humans, potentially allowing for the detection of bot activity.
- A concrete example of an adversarial objective is an industry player attempting to skew a national policy dialogue by adding synthetic extremist groups.
- The most dangerous threat is seen as "synthetic experts" who can create a misleadingly credible case for a particular agenda.
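The idea of spotting synthetic groups through within-cluster variance can be illustrated with a short sketch. The talk describes this as an open research area, so everything here is a hypothetical heuristic: the function names, the use of the median as a baseline, and the `ratio` threshold are all assumptions, and a low-variance cluster may simply be a group of genuinely like-minded people.

```python
from statistics import mean, median, pvariance


def within_cluster_variance(rows, labels):
    """Average per-statement variance of the votes cast inside each cluster."""
    result = {}
    for g in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == g]
        per_statement = []
        for column in zip(*members):
            seen = [v for v in column if v is not None]  # ignore unseen statements
            if len(seen) >= 2:
                per_statement.append(pvariance(seen))
        result[g] = mean(per_statement) if per_statement else 0.0
    return result


def flag_low_variance(variances, ratio=0.25):
    """Flag clusters whose variance is below `ratio` times the median cluster variance.

    The threshold is arbitrary and only meant to show the shape of the check.
    """
    cutoff = ratio * median(variances.values())
    return [g for g, v in variances.items() if v < cutoff]


# Toy data: cluster 0 votes with some internal disagreement,
# cluster 1 votes identically on everything (as scripted accounts might).
rows = [
    [1, -1, 1],
    [1, 1, -1],
    [-1, -1, 1],
    [1, -1, 0],
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]
labels = [0, 0, 0, 0, 1, 1, 1]

variances = within_cluster_variance(rows, labels)
flagged = flag_low_variance(variances)
print(variances)
print("flagged clusters:", flagged)
```

A flag from a check like this would be a prompt for closer inspection rather than proof of bot activity.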
4. Statement Submission and Emergence of Issues
- The conversation addresses how questions and issues emerge naturally within the Polus platform.
- An example is given of a conversation in the San Juan Islands about land bank policy.
- The conversation started with a general prompt, such as "What are your thoughts and feelings on the land bank?"
- Participants then submitted statements expressing their opinions, which were voted on by others.
- The system identifies statements that are widely agreed upon or controversial.
- The speaker clarifies that the system works by having people submit statements, which are then shown to others who can agree, disagree, or pass.
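How a statement comes to be labeled "widely agreed upon" or "controversial" can be sketched with two simple per-statement measures. These formulas and the 0.8 cutoffs are assumptions for illustration; the talk does not give the platform's actual definitions, and the statement ids below are placeholders.

```python
def statement_summary(votes):
    """Summarize one statement's votes (+1 agree, -1 disagree, 0 pass)."""
    agree = votes.count(1)
    disagree = votes.count(-1)
    decided = agree + disagree
    if decided == 0:
        return {"agree_rate": 0.0, "controversy": 0.0}
    return {
        # Share of decided voters who agreed.
        "agree_rate": agree / decided,
        # 1.0 when agree and disagree are evenly split, 0.0 when unanimous.
        "controversy": 1.0 - abs(agree - disagree) / decided,
    }


# Placeholder ballots keyed by hypothetical statement ids.
ballots = {
    "s1": [1, 1, 1, 0],
    "s2": [1, -1, 1, -1],
    "s3": [-1, -1, -1, 1],
}
summaries = {s: statement_summary(v) for s, v in ballots.items()}
widely_agreed = [s for s, v in summaries.items() if v["agree_rate"] >= 0.8]
controversial = [s for s, v in summaries.items() if v["controversy"] >= 0.8]
print("widely agreed:", widely_agreed)   # ['s1']
print("controversial:", controversial)   # ['s2']
```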
5. LLMs and Polus: Opportunities and Risks
- The speaker discusses a paper written with Anthropic exploring the use of LLMs in Polus.
- The paper examined tasks like moderation, summarization, and vote prediction.
- Summarization grounded in the submitted statements is a promising area, with efforts to tie each clause of a summary back to specific statements.
- LLMs were found to be surprisingly good at predicting people's votes, which raises concerns about the potential for synthetic voters.
- There's a risk that LLMs could be used to replace human participants in social research, further intermediating the public from institutions.
- The methodology for vote prediction involves giving the LLM progressively more statements and votes and then graphing its accuracy.
- The speaker expresses concern that people might mistake LLMs for real voters, leading to flawed conclusions.
- Topic modeling with LLMs, done in context, is presented as an alternative to PCA and clustering.
- The LLM is used to generate narratives based on the statistical outputs of previous steps.
- The speaker references Harvey, the OpenAI-backed legal AI product, as an example of how to reduce hallucinations in LLMs.
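The vote-prediction evaluation described above (reveal progressively more of a participant's votes, predict the rest, track accuracy) can be sketched as a small harness. The predictor here is a deliberately simple stand-in that copies the most similar other participant; in the study it would be an LLM call, which is not shown. `neighbour_predictor`, `accuracy_curve`, and the toy data are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Votes = Dict[str, int]  # statement id -> +1 (agree), -1 (disagree), 0 (pass)


def neighbour_predictor(known: Votes, others: List[Votes], target: str) -> int:
    """Stand-in for the LLM: copy the vote of the most similar other participant."""
    def similarity(other: Votes) -> int:
        return sum(1 if other[s] == v else -1 for s, v in known.items() if s in other)

    candidates = [o for o in others if target in o]
    if not candidates:
        return 0  # nothing to go on; fall back to "pass"
    return max(candidates, key=similarity)[target]


def accuracy_curve(
    held_out: Votes,
    others: List[Votes],
    predictor: Callable[[Votes, List[Votes], str], int],
    order: List[str],
) -> List[Tuple[int, float]]:
    """Reveal 0, 1, 2, ... votes in `order`; each time, predict all remaining votes."""
    curve = []
    for n_known in range(len(order)):
        known = {s: held_out[s] for s in order[:n_known]}
        targets = order[n_known:]
        hits = sum(predictor(known, others, t) == held_out[t] for t in targets)
        curve.append((n_known, hits / len(targets)))
    return curve


others = [
    {"s1": 1, "s2": 1, "s3": 1, "s4": 1},
    {"s1": -1, "s2": -1, "s3": -1, "s4": -1},
]
held_out = {"s1": -1, "s2": -1, "s3": -1, "s4": -1}

curve = accuracy_curve(held_out, others, neighbour_predictor, ["s1", "s2", "s3", "s4"])
for n_known, accuracy in curve:
    print(f"{n_known} votes revealed -> accuracy {accuracy:.2f}")
```

The resulting list of (votes revealed, accuracy) pairs is what would be graphed; swapping in a different `predictor` leaves the harness unchanged.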
6. Real-World Applications and Case Studies
- The speaker highlights several real-world applications of Polus.
- A UN Environment Programme (UNEP) project in Timor-Leste used Polus to gather opinions from people in tuk-tuks.
- Satakunta, Finland, has run large-scale Polus conversations involving up to 18,000 people.
- DeepMind published a paper on using AI to help humans find common ground, citing the Anthropic paper as inspiration.
- The DeepMind project explored whether LLMs could write collective statements that win broader approval than those written by humans.
- OpenAI funded a grant program to explore mapping human values to AI completions, with Polus used as a tool.
- Anthropic conducted a project to analyze Claude's constitution using Polus.
7. Community Notes on Twitter
- The speaker discusses the use of Polus-inspired methodology in Community Notes on Twitter.
- The work with Twitter began in 2021, following an article suggesting that Twitter use Polus to respond to misinformation.
- The Birdwatch paper (Birdwatch being the original name of Community Notes) cites Polus's goal of highlighting consensus.
- Community Notes uses matrix factorization for vote prediction, operating in continuous space rather than clustering.
- The goal is to identify notes that people with different points of view will find helpful.
- The platform has been generally well-received and fact-checks both right-wing and left-wing sources.
- The speaker emphasizes that Community Notes is not just a misinformation system but a collective response system.
- He notes that Community Notes has started to engage in collective responses beyond fact-checking, which he predicted.
- There are concerns about Community Notes being used for group harassment.
- The speaker suggests that the system is generalizable and could be adapted to surface mass public opinion on Twitter.
- He expresses hope that someone will implement a similar system on Bluesky.
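The contrast between clustering and matrix factorization in continuous space can be made concrete with a small sketch. It assumes a model of the form rating ≈ mu + b_user + b_note + f_user * f_note with a single latent factor, fitted by stochastic gradient descent; the talk does not give the model's exact form, so the function name, hyperparameters, and toy data are all assumptions. The intuition is that the factor term absorbs agreement explained by viewpoint, so a note's intercept reflects how helpful it is found across viewpoints.

```python
import random


def fit_notes_model(ratings, n_users, n_notes, epochs=500, lr=0.05,
                    reg_intercept=0.15, reg_factor=0.03, seed=0):
    """Fit rating ~ mu + b_user + b_note + f_user * f_note by SGD.

    ratings: list of (user, note, value) with value 1.0 = helpful, 0.0 = not helpful.
    Hyperparameters are illustrative, not taken from any production system.
    """
    rng = random.Random(seed)
    mu = 0.0
    b_user = [0.0] * n_users
    b_note = [0.0] * n_notes
    f_user = [rng.uniform(-0.1, 0.1) for _ in range(n_users)]
    f_note = [rng.uniform(-0.1, 0.1) for _ in range(n_notes)]
    for _ in range(epochs):
        for u, n, r in ratings:
            err = r - (mu + b_user[u] + b_note[n] + f_user[u] * f_note[n])
            mu += lr * err
            b_user[u] += lr * (err - reg_intercept * b_user[u])
            b_note[n] += lr * (err - reg_intercept * b_note[n])
            fu, fn = f_user[u], f_note[n]  # use the old values for both updates
            f_user[u] += lr * (err * fn - reg_factor * fu)
            f_note[n] += lr * (err * fu - reg_factor * fn)
    return mu, b_user, b_note, f_user, f_note


# Toy data: users 0-2 and users 3-5 form two camps.
# Note 0 is rated helpful by everyone, note 1 only by the first camp,
# note 2 only by the second camp.
ratings = []
for u in range(6):
    first_camp = u < 3
    ratings.append((u, 0, 1.0))
    ratings.append((u, 1, 1.0 if first_camp else 0.0))
    ratings.append((u, 2, 0.0 if first_camp else 1.0))

mu, b_user, b_note, f_user, f_note = fit_notes_model(ratings, n_users=6, n_notes=3)
ranked = sorted(range(3), key=lambda n: b_note[n], reverse=True)
print("notes ranked by intercept:", ranked)
print("user factors:", [round(f, 2) for f in f_user])
```

In this toy run the note rated helpful by both camps ends up with the largest intercept, while the user factors take opposite signs for the two camps. No discrete cluster assignment is ever made; each user simply gets a position on a continuous axis.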
8. Future Directions and Opportunities
- The speaker emphasizes the potential for Polus to connect political scientists and computer scientists.
- He envisions a future where open-source systems for mapping public views are connected to policy innovation units and social platforms.
- He believes there are opportunities to improve clustering algorithms, particularly by incorporating semantic information.
- Better vote prediction is another promising area, with potential applications in Community Notes.
- The speaker suggests that new forms of clustering could be used to identify expert signal and differentiate it from noise.
- He also discusses the possibility of reconstructing public arguments by analyzing the logical structure of statements.
9. Addressing Concerns about Emergence and Control
- The speaker addresses concerns that the fixed algorithm used for clustering limits the emergence of issues.
- He argues that the ability for any participant to submit any statement at any time makes the system emergent.
- He clarifies that the clusters are a function of the voting data, not the natural language of the statements.
- He acknowledges the need for more transparency and control over the interpretation of results.
- He announces that raw data files will soon be available, allowing users to perform their own analysis.
- He encourages the creation of tools that allow participants to engage with the data and make their own meaning.
- He suggests that a future direction could be to allow participants to pose meta-level questions about the data and then vote on those questions.
10. Conclusion
- The speaker concludes by reiterating the potential of Polus to facilitate meaningful conversations and inform policy decisions.
- He emphasizes the importance of addressing the risks associated with LLMs and adversarial attacks.
- He encourages further research and development in areas like clustering algorithms, vote prediction, and identity verification.
- He expresses excitement about the future of Polus and its potential to connect diverse perspectives and promote collective intelligence.
Main Takeaways:
Polus is a powerful open-source platform for gathering and analyzing public opinion. Its core strength lies in its ability to identify diverse perspectives and potential areas of consensus. While the platform faces challenges related to adversarial attacks and the use of LLMs, it holds significant promise for informing policy decisions and promoting more inclusive and deliberative conversations. The future of Polus hinges on continued research and development in areas like clustering algorithms, vote prediction, and identity verification, as well as a commitment to transparency and user control.