I extend the same right to authorship to you. In a plural society we trade on reciprocity. I don’t force my values on you, you don’t force yours on me. Persuasion is fair, compulsion is not. Where rules bind everyone, we justify them in shared reasons and keep them inside a rights floor of consent, agency, due process, equality, and non-violence.
Simple. Job done. We can all go home.
Of course it’s not simple.
I’ve argued two positions so far: that systems created by humans should be responsive to human values, and that artificial intelligence is a technology that could enable that responsiveness. These next pieces take up four enabling challenges: authorship, articulation, alignment, and adaptation.
It’s a big ask because it forces us to do something we’ve historically been bad at: making our values explicit, negotiating them, binding our systems to them in ways that hold up under stress, and providing avenues for change. But that’s also why it matters. If we don’t, our systems won’t be values-free. Instead, they’ll just be aligned with silent defaults like profit maximisation or bureaucratic self-preservation.
If we don’t author values in public, systems will author them in private. So, the first challenge is authorship: who decides, by what right, and through which procedures?
Who authors our values?
It’s a big question that we don’t necessarily think about all that often. Who authors our values and how do they do it? I have opinions of course, as I imagine everyone does. My answer is that it depends on the context, the culture, the worldview of the people involved. I’ll get into some of those options, and my own preferences, shortly. But for now, the specifics matter less than a simple principle:
For a society’s systems to align with its people’s values, it must first decide two things: who has the authority to define them, and how those values are created, reviewed, and renewed.
Once again, it’s in the AI alignment space that this issue has been recently highlighted. In the United States, AI watchers are concerned that artificial intelligence will be aligned to values authored by technologists in San Francisco. The authors there are amplifying the values of a capitalist, socially progressive, technocratic, West Coast worldview that is not necessarily shared by the rest of America, much less the world. In China, there is concern that the values of the government are shaping the development of artificial intelligence there. Its priority values include social stability, central control, and national development, with little room for dissenting value sets. In Europe, governments are focussed on embedding the value of safety in AI. Their aim is to constrain corporate overreach, with the potential consequence of slowing innovation compared to competitors.
These are presented as core values which are shared across a community. In the examples above, they have some level of legitimacy and are not inconsistent with the communities they serve. In the USA, there is a long tradition of innovation, freedom, capitalism, and individualism. They aren’t just slogans, they are the lived identity of the people there. In China, there is a long, continuous cultural history with collective memory of brutal dynastic wars. Stability is not just a bureaucratic priority, it’s a survival strategy. In Europe, memories of authoritarian overreach are fresh in the minds of the older generations. Their declared priority is to protect the individual from the state, corporations, and, eventually, artificial intelligence.
The point is not which of these value sets is right, but that each is distinct. And that in all cases the authorship of values is being performed by systems, either government or corporate. The textbook definition says that a society agrees on its values first and then builds its systems to reflect and uphold them. But that’s not how it usually plays out.
In practice, systems in power tend to define the values set, or at least the parts that get formal recognition. What’s missing is not coherence but completeness. Systems excel at operationalising instrumental values, because those are things they can measure, analyse, enforce, and optimise for. What they struggle to capture are the deeply human dimensions: moral intuitions, aesthetic sensibilities, ambiguity, the willingness to bend a rule for compassion’s sake. Without deliberate human authorship, these softer but vital elements risk being left out entirely. The result is a values framework that reflects the logic of the system more than the full breadth of the people it claims to represent.
Methods of authorship
Over our history, we’ve experimented with how we establish values. In early societies it fell to tribal elders, oral tradition, lived experience, and consensus among respected figures. These methods had deep continuity but were slow to adapt, with the potential to exclude alternative views. Religious authorities have interpreted divine truths into value sets that communities lived by. These beliefs were an anchor that held a community together, but they are slow to change, which has led to religion being challenged as an author of values. There have been authoritarian leaders, kings and dictators, who aimed to align the values of their societies to their own. Despite the early clarity of vision they provide, they eventually prove brittle, often failing to adapt in the face of new realities.
There is a long tradition of leveraging specialists to shape the values of societies: philosophers advising kings in matters of truth and morality. Our modern form is the academic and their applied technocratic brethren, applying reason and analysis to matters relevant to values. This has the advantage of being evidence-based but can be elitist and may undervalue culture or emotion. We also establish values through democratic processes, even if indirectly. Politicians talk about values alongside policy but generally discuss only the contested ones that don’t undermine existing power structures, including their own. Democratic values authorship has the advantage of broad legitimacy, but it can be slow, polarised, narrow, and swayed by populism and short-term thinking. Finally, we have values that emerge from informal communities. These are highly adaptable and foster local autonomy and innovation, at least within the group. They can, however, be difficult to scale and are often only held by narrow sections of the community.
In practice, societies use a combination of these methods to establish shared values. Each is a tool for achieving the same goal.
I’ll put my cards on the table here: if I had to choose how a society establishes its values, it’d be through a participatory process where all the adult moral agents have a say in the values expressed by their systems. Most likely, it’d need to be supported by AI, possibly through guided discussions on values, framed more like a casual chat about the issues of the world than a psychometric survey. You’d take the aggregated results and try to distil fundamental beliefs across populations at a range of geographic resolutions and aim to get a consensus on what core values most of the community share.
This is absolutely informed by my worldview. There are no experts, elders, priests, or glorious leaders telling me what my values are. Instead, values emerge through pluralist lived experience and the shared culture of the community. I believe that you should take values held by a population and build systems that reflect them as they are, rather than trying to shape people to fit the systems. That’s been very hard up until now. It might be possible with AI.
Whatever way we end up deciding to author our values, we need to agree on a method for building consensus. Once we have our values, it’s important that we write them down so that our systems can interpret them.
Operationalising authorship with help from AI
I’m trying to approach these challenges with the mindset of an architect. While we need to understand the philosophical principles behind values, morals, and ethics, my ultimate aim is to design and build. The essays I’ve written so far are as much about helping me formalise my own thinking as they are about sharing it with the world.
Building values-aware tools is possible because of LLMs. We shouldn’t be surprised that large language models excel at tasks using language. I think there are values that probably exist pre-language, but they’d be different to how we understand the term in common usage. I’ve been spending a lot of time experimenting with how LLMs handle language around values and have found some opportunities for using them in the authorship space. I’ll go through these tools in detail in future posts but will explain the broad concepts below.
Surfacing individual values
For the individual, AI can surface our values from conversation or text by extracting the statements that are relevant to them. A while back I built a tool that takes the hundreds of chats I’ve had with ChatGPT and performs a comprehensive worldview analysis. It works by iterating through each chat and extracting relevant worldview information using an elicitation template. It then summarises all the chats on a single worldview element (ontology, epistemology, who has agency, concept of time, etc.) and produces a long report on the user’s worldview.
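To make the shape of that pipeline concrete, here’s a minimal sketch of the two-pass approach in Python. The file name, prompts, model, and worldview element list are placeholder assumptions for illustration, not the actual tool.

```python
# Sketch of a two-pass worldview analysis over an exported chat history.
# Assumptions (illustrative only): chats live in a conversations.json export,
# the OpenAI Python client is used, and the prompts are rough placeholders.
import json
from openai import OpenAI

client = OpenAI()

WORLDVIEW_ELEMENTS = ["ontology", "epistemology", "agency", "concept of time"]

ELICIT_PROMPT = (
    "From the conversation below, extract any statements that reveal the "
    "user's worldview, organised under these elements: {elements}.\n\n{chat}"
)

def elicit(chat_text: str) -> str:
    """Pass 1: pull worldview-relevant fragments out of a single chat."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": ELICIT_PROMPT.format(
            elements=", ".join(WORLDVIEW_ELEMENTS), chat=chat_text)}],
    )
    return response.choices[0].message.content

def summarise(element: str, notes: list[str]) -> str:
    """Pass 2: summarise everything gathered, for one worldview element."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content":
            f"Summarise the user's position on '{element}' from these notes:\n"
            + "\n---\n".join(notes)}],
    )
    return response.choices[0].message.content

chats = json.load(open("conversations.json"))         # exported chat history
notes = [elicit(json.dumps(chat)) for chat in chats]  # one elicitation per chat

# The per-element summaries together form the long worldview report.
report = {element: summarise(element, notes) for element in WORLDVIEW_ELEMENTS}
print(json.dumps(report, indent=2))
```

The important design choice is the two passes: elicitation keeps each call to a single chat, and the per-element summaries over the accumulated notes are what become the report.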
This approach worked for me because I spend a lot of time going through philosophical concepts related to my systems projects with LLMs. It means that I have records of information that is rich in worldview context. There is a massive collection bias, mostly that it misses out on all my personal life because I don’t discuss that with the platform. But by aggregating my human expressions over a long period of time it was able to synthesise it all into something which was consistent with my understanding of myself.
I won’t be sharing the whole output; it’s a little too revealing. But when I pressed the LLM to distil my worldview into the most important and repeated values, it came up with the list below:
Agency, Empathy, Truth, Nonviolence, Fairness, Democracy, Transparency — stewarded by Accountability and Pluralism.
It also gave me some interesting insights into where my values clashed.
Detecting trade-offs
Ned Flanders somehow manages to do everything the Bible says, even the stuff that contradicts the other stuff. But we can’t be expected to live up to his lofty moral standards. Our values clash all the time. The goal is never to eliminate the contradictions, it’s to make trade-offs visible and defensible. Thankfully we’re well equipped to constantly make compromises as we go about living our lives as best we can.
From my experiments I’ve concluded that AI can help with detecting trade-offs, surfacing relative weightings, and flagging inconsistencies. It works particularly well when going up against values-rich language like that used by politicians, which we’ll be looking into in a few weeks. But even as a vibe it can be useful for an individual, even just to understand where their own priorities lie when making decisions.
The second part of the worldview analysis asked the LLM to raise the values clashes it surfaced along the way. It gave me some good results:
Transparency vs Safety: I’m a fan of strong transparency in systems to support information flow and feedback. But I’ve worked in high security contexts where I know that secrecy protects people.
Agency vs Structure: I’m all for individual agency and putting the person first. But every framework or tool I build imposes structure that deliberately constrains individuals.
Pluralism vs Coherence: I value a diversity of worldviews and voices. But I also push for coherent systems and a shared set of fundamental values that allow us to function as a society.
There were a lot of others. But it shows that LLMs can look at values in an individual and show where they clash, which is enough for now.
There is also the potential to surface the relative weighting of values in a more structured way. One tool that I’m working on now looks at commonly clashing values pairs, like privacy and national security, and detects trends in language to see where the mood of the political class is leaning. It’s a work in progress.
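As a rough illustration of what that structured approach could look like, here’s a hedged sketch for a single clashing pair. The speeches.json layout, scoring prompt, and model name are assumptions made for the example; the real tool is still taking shape.

```python
# Sketch of trend detection for one clashing values pair.
# Assumptions (illustrative only): speeches.json holds {"date", "text"} records
# with ISO dates, and the model returns a bare number when asked to.
import json
from statistics import mean
from openai import OpenAI

client = OpenAI()

PAIR = ("privacy", "national security")

def score_priority(text: str) -> float:
    """Score one speech from -1 (privacy prioritised) to 1 (security prioritised)."""
    prompt = (
        f"When '{PAIR[0]}' and '{PAIR[1]}' are in tension in the text below, "
        "which does the speaker prioritise? Answer with a single number from "
        f"-1 (entirely {PAIR[0]}) to 1 (entirely {PAIR[1]}).\n\n" + text
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    # Naive parsing: assumes the reply is just a number.
    return float(response.choices[0].message.content.strip())

speeches = json.load(open("speeches.json"))
by_year: dict[str, list[float]] = {}
for speech in speeches:
    by_year.setdefault(speech["date"][:4], []).append(score_priority(speech["text"]))

# Crude mood-of-the-year trend for the pair.
for year in sorted(by_year):
    print(year, round(mean(by_year[year]), 2))
```

A single numeric score is a blunt instrument, but averaged over a year of speeches it is enough to show which way the language is drifting.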
Aggregating shared values
Finally, AI can facilitate the aggregation of values. It offers the opportunity to discover the named values that a population claims to hold by extracting, from media or direct input, the beliefs of individuals. These can then be examined across the population to find what proportion of people hold a particular value.
It also has the capability to describe what individuals actually mean when they refer to a value. From there it can find the fundamental elements of a value within a cultural context as well as the areas where a definition is contested.
Recently, I’ve been working on analysing Australian parliamentary transcripts (Hansard) to extract references to values during debates. Hansard records are ideal for this because they are wonderfully structured in an XML format with all the metadata and because politicians speak in a pattern of value → goal → policy when debating. It means that you can build up the consensus values of a population, in this case politicians, based on the language they’re using.
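For a sense of the mechanics, here’s a small sketch of that pipeline. The XML element names, the extraction prompt, and the JSON shape are assumptions for illustration; the actual project’s schema and categories are described in the write-up on my website.

```python
# Sketch of extracting value statements from a Hansard XML debate file.
# Assumptions (illustrative only): element names ("speech", "name") follow one
# common Hansard layout, and the model returns well-formed JSON for the
# value -> goal -> policy pattern described above.
import json
import xml.etree.ElementTree as ET
from openai import OpenAI

client = OpenAI()

EXTRACT_PROMPT = (
    "Politicians often argue in a value -> goal -> policy pattern. From the "
    "speech below, list each value invoked, the goal it supports, and the "
    "policy being argued for, as JSON: "
    '[{"value": ..., "goal": ..., "policy": ...}].\n\n'
)

def speeches(path: str):
    """Yield (speaker, text) pairs from a Hansard XML file."""
    root = ET.parse(path).getroot()
    for speech in root.iter("speech"):
        speaker = speech.findtext(".//name", default="unknown")
        text = " ".join(speech.itertext())
        yield speaker, text

def extract_values(text: str) -> list[dict]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": EXTRACT_PROMPT + text}],
    )
    return json.loads(response.choices[0].message.content)

# Tally how often each named value is invoked across the debate.
counts: dict[str, int] = {}
for speaker, text in speeches("hansard_debate.xml"):
    for item in extract_values(text):
        value = item["value"].lower()
        counts[value] = counts.get(value, 0) + 1

for value, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{value}: {n}")
```

Counting raw mentions is the crudest form of aggregation; the actual project goes further, grouping the extracted values into the categories mentioned below and aggregating a definition for each.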
It’s a work in progress, but you can read about the project on my website. There you’ll find the 14 values categories it settled on and aggregated definitions for each of them. The method works, at least as a proof of concept.
Eventually, I’d like to be able to extract the values statements of each individual politician and track their trade-offs. You can expect a write-up of the whole project in a few months.
Conclusion
I’ve covered a lot of ground today. It started with an assertion that I can author my own values, before extending that into a civic principle. I showed how we’ve authored values in the past and pointed out that systems will do the authoring for us if we leave it to them. I put out my own preference for collective authorship before finally examining some of the ways that AI might be able to help us author our individual and collective values going forward.
Our values aren’t authored once and left as deterministic rules. They’re constantly rebalanced through trade-offs and refinements as culture shifts. For decades, we’ve struggled to meet in the values commons, with traditional methods of authorship breaking down or being captured by narrow interests.
AI presents a new possibility. It can surface individual values, map how we rank them when they clash, and aggregate shared principles across whole communities. Done well, it could act less like a priest or technocrat telling us what we believe and more like a mirror. In that sense, AI could help us recover something we’ve lost: a living, participatory authorship of values at human scale.
I’ll be posting about values articulation in two weeks, describing the methods we have for communicating values to each other and to systems. But next week I’ll be releasing my Alignment Chart for Those-Who-Have-Seen-the-Insanity-of-the-System-and-Responded-as-Best-They-Can for something different.