
Book Summaries

I was never much of a reader as a child. It wasn't until my freshman year of college that I started reading in any significant volume. My reading preferences back then, and for a long time afterwards, were primarily for escapism. It was only recently that I started reading more non-fiction books, focusing on a couple of topics. Thankfully, both of my kids read voraciously - it's sometimes a challenge to get them to stop and put the book down!


Naturally, it was at our local library that I got the inspiration for this section of my website. I was browsing the books in the business section when I saw a book spine that piqued my interest. The title of the book is The 100 Best Business Books of All Time. Granted, it was published in 2009, and many business books have been published since then, but the concept lodged in my brain. It's easy enough to ask an AI to summarize a book for you, but I've always learned and retained more when summarizing something myself.

Reading List

Here's my planned reading list. If you have recommendations, please let me know.

  1. Think and Grow Rich, Napoleon Hill - Originally published in 1937, I'm reading the 2003 edition by Arthur R. Pell - In Progress

  2. How to Win Friends and Influence People, Dale Carnegie - Published in 1936 - Not Started

  3. The Innovator's Dilemma, Clayton Christensen - Published in 1997 - Not Started

  4. Sapiens, Yuval Noah Harari - Published in 2011 - Not Started

  5. Poor Charlie's Almanack, Charlie Munger - Published in 2005 - Not Started

+ about 30 more - Not Started

The Lean Startup

Eric Ries, Published 2011

Introduction

The Lean Startup Method

  • Entrepreneurs are everywhere, even in big companies, as long as they're trying to create new products and services under conditions of extreme uncertainty.

  • Entrepreneurship is management. A startup is an institution, not just a product. 

  • Validated learning. Startups exist to learn how to build a sustainable business, not just to make stuff or even make money.

  • Build-measure-learn. Shorten the feedback loop. 

  • Innovation accounting. The boring work of measuring progress, setting up milestones, and prioritizing tasks is super important, and we need to hold people accountable.

We have a century of management principles and practices, but they're not going to work with startups and innovation. 

Part 1: Vision

Chapter 1: Start

Entrepreneurial management is a form of management, but not the traditional style. Entrepreneurs who try to apply the traditional style end up fearing bureaucracy and stifled innovation. A new form of management needs to account for huge uncertainty in the market.

 

Lean thinking was pioneered in the auto industry (Toyota) decades ago, and emphasizes a few key differences from traditional manufacturing: shrinking batch sizes, just-in-time production and inventory control, and an acceleration of cycle times. The Lean Startup movement applies similar concepts to entrepreneurial management. The key measure for startups is validated learning, and shifts away from more traditional measurements of productivity. Shorter feedback loops are super important in measuring learning, and allow an entrepreneur to make quick adjustments to increase the likelihood of success.

 

Startups still need a vision, a destination in mind. That vision is converted into a strategy, a plan to get to the vision which may include a business model, product road map, customer analysis, partner and competitor investigation, etc. The product is the end result of the strategy. Products can change constantly, through the process of optimization. The strategy can also change, through the process of pivoting.

Chapter 2: Define

Definition of entrepreneur: someone who has the right personnel organized into a proper team structure, with a strong vision for the future and an appetite for risk-taking.

 

Definition of a startup: a human institution designed to create a new product or service under conditions of extreme uncertainty.

In most cases, the new product or service will be an innovation provided to customers. The innovation can be novel scientific discoveries, repurposing an existing technology for new use, devising new business models that unlock value that was hidden, or bringing a product or service to a new location or previously underserved set of customers.


Startups, and the entrepreneurs that run them, need nurturing and support, especially if they’re groups within a large enterprise. Large enterprises typically follow traditional management styles which can lead to sustaining innovation. But a faster feedback cycle and a higher risk appetite can lead to disruptive innovation, typically resulting in new sustainable sources of growth. The responsibility for nurturing and supporting entrepreneurs within an enterprise is full-stack, from senior leadership all the way down through the middle managers to the workers on the floor.

Chapter 3: Learn

Learning is often used as a justification for failure to deliver. Failures become “learning experiences”. Lean Startup distinguishes between learning and validated learning. The former is an after-the-fact rationalization to explain away failure. The latter is the process of demonstrating empirically that a team has discovered valuable truths about a startup’s present and future business prospects.


The author’s startup, IMVU, launched based on what they thought customers wanted. They built a product on those assumptions and launched to essentially no customers. They were able to get a very small number of customers using the product, but not nearly enough to call themselves a success. As soon as they started talking to their customers and understanding what the customers actually wanted (rather than assuming what the customers wanted, or assuming that the customers wanted what they said they wanted), they saw tremendous growth in their product.

 

Lean manufacturing’s first, and most important, question is “which of our efforts are value-creating and which are wasteful?” The goal is to see waste and eliminate it. As a result, it’s super important to understand what the customer actually wants as early in the process as possible, with as little time and as few resources spent as possible, to avoid the waste of building something the customer doesn’t want. Experiment early and often, and learn from those experiments. Vanity metrics don’t get you to success, and justifying your failure as a learning experience doesn’t get you to success. Only the hard work of understanding what the customer actually needs, wants, and is willing to pay for gets you to success.

 

The question needs to shift from “Can we build this product?” to “Should we build this product?” and “Can we build a viable business around this set of products and services?” To answer those questions, we need a method for systematically breaking down a business plan into component parts and testing each part empirically.

Chapter 4: Experiment

Think big, capture your assumptions, then start with a small scale experiment to test those assumptions. If we’re assuming that X number of people will buy our product nationally, test it on a much smaller scale like a single metro or a neighborhood within a city. The cost will be much smaller than a big rollout associated with a big vision, but the feedback will validate the assumptions and confirm the big vision. Experimenting with real people using a (stripped down version of your) real product will give you the opportunity to see how your customers will use your product - spoiler: it may not be the way you intended the product to be used.


Big visions can be broken down into a value hypothesis and a growth hypothesis. The value hypothesis tests whether or not a product or service really delivers value to customers once they start using it. The growth hypothesis tests how new customers will discover the product or service. The goal is to target your experiments to early adopter customers who will give you insights on your value hypothesis, and can then turn around and become your champions for the growth hypothesis. In order to maximize the chances of your early adopters spreading the word, deliver the best possible minimum viable product experience you can to them.

 

Shipping an initial version of a product with a much smaller feature set than your roadmap will invariably lead to your test users complaining about missing features. But the set of features they request can confirm that those items belong on your roadmap, and conversely, the roadmap features that customers are not requesting can be deferred or deleted.

Part 2: Steer

The Build-Measure-Learn loop is at the core of the Lean Startup model. Many people have professional training in one element or another of the loop. Engineers learn how to build well. Data scientists learn how to measure well. But the power of the loop is in minimizing the time it takes to go through one turn of the loop, using the value and growth hypotheses. Based on the learning from each turn (e.g. discovering that one of our hypotheses is false), we may want to pivot. Getting to the decision on whether or not to pivot is the key to Lean Startup. But to determine the hypothesis, we may need to plan backwards through the loop: pin down what we want to learn, plan our measurement criteria for that learning, and then plan the product build.

Chapter 5: Leap

The core value and growth hypotheses of any business are called its leap-of-faith hypotheses, and the business should strive to determine their validity as soon as practical. The key is to get out of your chair, get out of your office, and talk to customers. But don’t blindly accept what they tell you they want. Ask them about pain points rather than solutions to their pain points. Don’t try to sell them your vision or strategy - you shouldn’t have one yet. Your conversations can help you create a customer archetype, a brief document that seeks to humanize the proposed target customer. This customer archetype should always be kept in mind when making decisions about the product.

 

Basing your business strategy on another business that has worked in the past is not always wrong, but you have to identify the similarities and differences between your business and the predecessor. The exercise of finding those similarities and differences will help uncover assumptions. Those assumptions, then, become your leap of faith hypotheses.

 

Value is not the same as profit. A business can be value-creating without being profitable - see Amazon. But a business that is profitable without being value-creating will eventually crumble - see all the busted companies from the dotcom era.

Chapter 6: Test

Test your leap-of-faith assumptions using an MVP approach. An MVP can be created in many different ways, but the goal should be to minimize the effort spent building it, so that a wrong assumption wastes as little as possible.

 

One example for MVP mentioned in the book is a screen capture video of the product being used.

 

One example for a concierge MVP mentioned in the book is to have a fully manual process on a very small scale, of what would eventually become automated. The very small number of customers get special treatment from the startup founder. Once you confirm that customers are willing to pay for the product, you start building the automations to scale up.

 

The concept of releasing an MVP goes against engineers’ and product people’s sense of pride in quality. But the central question is about testing assumptions and learning. Releasing an MVP also allows the customer to imagine the features the product could have in the future, and may even get early adopters more engaged in the feature definition cycle. One caveat: the MVP should still be free of defects, because defects make it difficult to iterate the Build-Measure-Learn loop.

 

And if the MVP proves the leap of faith hypotheses incorrect, then it’s time to pivot.

Chapter 7: Measure

Standard accounting doesn’t work for startups because the main unit of progress is validated learning. Innovation accounting is a three-step approach to measure the success of the startup. 

 

The first step is to release a minimum viable product to establish a real baseline based on real customers. The MVP will also define and start testing the leap of faith assumptions.

 

The second step is to tune the engine. Tuning the engine requires a small batch size and a crucial validation step at the end of the development and launch process to measure how much the feature moved the needle. A/B testing is the main strategy to measure the impact of each feature.

 

The third step is to pivot or persevere based on the outcome of the A/B test.

 

Vanity metrics are metrics used to tell a positive story. They’re usually summary-level and don’t give us a clear indication of why the numbers are moving the way they are. Cohort analysis is a better approach because it measures each group of users who started in a given period (a cohort) separately, rather than reporting aggregate numbers.
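The difference between an aggregate vanity metric and a cohort view can be sketched in a few lines of Python (the users, months, and conversion events here are made up for illustration):

```python
from collections import defaultdict

def cohort_conversion(events):
    """Group users by signup month and compute each cohort's
    conversion rate, instead of one aggregate number."""
    signups = defaultdict(set)
    payers = defaultdict(set)
    for user, month, paid in events:
        signups[month].add(user)
        if paid:
            payers[month].add(user)
    return {m: len(payers[m]) / len(signups[m]) for m in signups}

# Hypothetical data: (user_id, signup_month, became_paying_customer)
events = [
    ("a", "2024-01", True), ("b", "2024-01", False),
    ("c", "2024-02", False), ("d", "2024-02", False),
]
rates = cohort_conversion(events)
# The aggregate "25% of all users pay" would hide that the newer
# cohort converts at 0% - the per-cohort view surfaces the decline.
```

The aggregate number looks stable or improving while the most recent cohort is actually doing worse; that is exactly the story vanity metrics obscure.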

 

In order for metrics to be useful, they must follow the three A’s:

  • Actionable - the metrics must show clear cause and effect. Let’s say that we have 40K hits on our website this month, a new record. What caused the increase? Was it the marketing campaign? Was it the new features? Was it seasonal? Was it because we’re counting each page and image and resource request as a hit, and we’ve loaded up our homepage with a bunch of new images?

  • Accessible - the metrics must be readily available to all employees, and easily understood by employees and managers who are supposed to use them to guide their decision making. Make sure that the metrics are not just reserved for the board meeting. Also, consider converting the “website hits” metric to “active users”, something more easily understood. Consider adding the number of users in each of the steps in your sales funnel (visiting, registering, logging in, paying).

  • Auditable - the metrics must reflect the reality on the ground. The results must be verifiable through spot checking the data with real customers. The metrics should also be pulled directly from the master data, rather than an intermediate source.
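As a concrete illustration of the accessibility point above, the "sales funnel" framing can be sketched as stage-to-stage conversion rates (the stage names match the list above; the counts are invented):

```python
def funnel_report(stage_users):
    """stage_users: ordered list of (stage_name, user_count).
    Returns the conversion rate from each stage to the next,
    which is far easier to act on than a raw "hits" number."""
    report = {}
    for (name, count), (next_name, next_count) in zip(stage_users, stage_users[1:]):
        report[f"{name} -> {next_name}"] = next_count / count
    return report

# Hypothetical counts for the visiting/registering/logging in/paying funnel
funnel = [("visiting", 4000), ("registering", 800), ("logging in", 600), ("paying", 60)]
report = funnel_report(funnel)
```

A report like this makes it obvious which stage to experiment on next, where "40K hits" would not.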

Chapter 8: Pivot (or Persevere)

Every entrepreneur eventually faces an overriding challenge in developing a successful product: deciding when to pivot and when to persevere. The decision criteria are not set in stone, they do not follow a formula. Each pivot or persevere decision is unique and dependent on the company, industry, inflection point, etc.

 

One major pitfall is the company that tests a hypothesis and concludes that the hypothesis is partially true. This company will generate enough growth to keep the lights on, but not as much as expected based on the company’s vision. Even in these situations, perhaps especially in these situations, it’s important to have an unbiased, scientific approach to determining if the company should pivot to a new fundamental hypothesis.

 

Pivoting takes courage. Courage to admit that your hypothesis was wrong. Courage to declare your work not as value-achieving as hoped-for. Pivoting can also demoralize a company’s staff.

 

The pivot or persevere decision should be made periodically (every few weeks to every few months), and requires the participation of the product development team and the business leadership team. Product dev should bring the results of the A/B tests for more than just the past period, and business leadership should bring the customer interactions for more than just the past period. Based on the evidence, the joint team should make a decision to stay the course (persevere) or adjust their fundamental hypothesis (pivot).

 

Catalog of pivots:

  • Zoom-in pivot - the company realizes that what was previously considered a single feature in the product should really become the whole product

  • Zoom-out pivot - the company realizes that what was previously considered a whole product should really become a single feature in the product

  • Customer segment pivot - the company realizes that the product solves a real problem for real customers, but that they are not the type of customers originally planned to serve. NOTE: the original fundamental hypothesis is partially verified when considering this pivot

  • Customer need pivot - the company realizes that the problem it’s trying to solve is not that important to its customers; however, because the company is very familiar with its customers, it has discovered a more important problem that its team can solve

  • Platform pivot - the company realizes that the product should change from an application to a platform, or vice versa. The application is something that the company sells to the customer, whereas the platform is something that the company builds to allow its customers to develop their own applications on top of

  • Value capture pivot - the company changes the way it converts its product’s value into revenue. This type of pivot commonly takes the form of add-on premium features

  • Engine of growth pivot - the company realizes that its engine of growth needs to shift between one of the three well-defined models: viral, sticky, and paid

  • Channel pivot - the company realizes that it can more easily / cheaply / efficiently sell its product via a different sales / distribution channel - e.g. from physical DVDs to streaming video

  • Technology pivot - the company realizes that it can more easily / cheaply / efficiently implement its product using a different technology. NOTE: this is less of a product pivot because nothing else is changing aside from how the product is built

Part 3: Accelerate

Develop the techniques that allow a Lean Startup to grow and mature into a Lean Enterprise that maintains its learning orientation, agility, and innovative culture.

Chapter 9: Batch

Large batch sizes may seem more efficient. In the manufacturing industry, the goal is to keep all machines operating 24x7 at peak efficiency, which leads to specialization. Specialized machines make specific parts of your product. Unfortunately, if there’s a defect in one of the machines, the entire (large) batch of parts produced by that machine is now considered waste. And even if there’s no defect, all the parts take up physical space on the warehouse floor without a finished product.

 

The same can apply to a non-physical production process like software. Specialization in roles (developer, tester, designer, product manager, etc.) leads to large batch sizes. The product manager comes up with a block of work in the form of a new feature and hands off to the designer, who hands off to the developers, who hand off to the tester. The expectation is that, if the process is running smoothly, we have an efficient pipeline. But if there’s any ambiguity or lack of understanding, the whole pipeline comes to a halt. If the development team needs clarification on functionality from design or product, those other teams cannot work on the next large batch. The amount of rework is high, which reduces the overall throughput of the pipeline. And even if there’s no ambiguity, the quantity of incomplete deliverables (requirements docs, designs, untested code, undeployed code, etc.) piles up well ahead of anything actually shipped.

 

To solve this issue, a Lean Startup needs to implement smaller batch sizes, sometimes as small as a single feature, and see the smaller batch through to the customer. Only then can the hypothesis be validated and the company achieve its learning.

 

There are many automation tools to help a company achieve smaller batches, regardless of industry or product. Software teams can obviously benefit from cloud infrastructure and CI/CD pipelines. Manufacturing can use assembly line methods similar to auto manufacturers. 3D printing also allows for a much faster turnaround than traditional injection molding for production.

 

Smaller batches, combined with a pull-based, WIP-limiting approach like Kanban, can supercharge the lean company.

Chapter 10: Grow

Sustainable growth is characterized by one simple rule: New customers come from the actions of past customers. There are four primary ways that past customers drive sustainable growth:

  1. Word of mouth - a past customer tells someone else about your product, which entices them to buy your product. E.g. TiVo, FiOS.

  2. As a side effect of product usage - a past customer uses your product, and by that very use drives additional customers to your product. E.g. PayPal, Evite, or Facebook.

  3. Through funded advertising - past customers generate revenue for your business which is then funneled into advertising your product to attract new customers. E.g. beer commercials during the Super Bowl.

  4. Through repeated purchase or use - past customers buy your product again because of a subscription or because of the one-time-use nature of your product. E.g. diapers, groceries, Netflix.

 

These four mechanisms power feedback loops that are termed “engines of growth”. Each engine of growth has specific metrics that allow a startup to measure their progress. These metrics may seem counterintuitive, but are fundamental to drive sustainable growth.

 

The sticky engine of growth attracts and retains customers for the long term. It aligns with the “repeated use” mechanism for how past customers drive sustainable growth. The key metric for this engine of growth is retention rate. A high retention rate validates the fundamental assumption that once a customer starts using your product they want to continue using it. Tracking retention (and its counterpart: attrition) is as important as tracking new customers. If the attrition rate is lower than the new customer rate, then the company will grow. 
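The arithmetic behind the sticky engine is simple compounding; here's a quick sketch (the starting size and rates are hypothetical):

```python
def sticky_growth(customers, new_rate, churn_rate, months):
    """Compound growth when new-customer and churn rates are
    fractions of the current customer base per month.
    The base grows whenever new_rate > churn_rate."""
    history = [customers]
    for _ in range(months):
        customers += customers * new_rate - customers * churn_rate
        history.append(customers)
    return history

# Hypothetical: 5% new customers and 2% churn per month
# means the base compounds at 3% per month.
base = sticky_growth(1000, 0.05, 0.02, 12)
```

The point of the metric pairing is visible in the formula: a big new-customer number means nothing on its own; only the gap between acquisition and attrition compounds.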

 

The viral engine of growth relies on customers doing the majority of your marketing. Awareness of the product spreads virally from one customer to a set of additional customers. Growth happens automatically as a side effect of using the product. The key metric for the viral engine of growth is the viral coefficient, which measures how many new customers each new customer brings along as a consequence of signing up. How many friends will each new customer bring? A viral coefficient greater than 1 results in exponential growth, while a viral coefficient less than 1 causes growth to stagnate. The key is to minimize the resistance to using the product. Many viral companies give away their product for free and rely on advertising revenue.
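A toy model makes the greater-than-1 / less-than-1 threshold concrete (the seed size and coefficients here are invented for illustration):

```python
def viral_cohorts(seed, k, rounds):
    """Each cohort of new users brings k more users on average
    (k is the viral coefficient). k > 1 compounds toward
    exponential growth; k < 1 fizzles toward a ceiling."""
    cohort, total = seed, seed
    for _ in range(rounds):
        cohort *= k
        total += cohort
    return total

growing = viral_cohorts(100, 1.1, 10)   # k > 1: each wave is bigger
stalling = viral_cohorts(100, 0.9, 10)  # k < 1: waves shrink toward zero
```

With k below 1 the total approaches a fixed ceiling (geometric series), which is why even a small improvement in the coefficient around 1.0 changes the company's trajectory entirely.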

 

The paid engine of growth relies on a sales process or sales team to acquire new customers, and will almost definitely rely on advertising. The two key metrics are the cost per acquisition (CPA) and the lifetime value (LTV). The cost per acquisition is self-explanatory: it’s the amount of money that was required to gain a new customer. This could be an ad campaign that generates new B2C customers, or a funded sales team that works on winning a B2B contract. The lifetime value is the projected amount of money that the customer will generate for you over the course of their lifetime using the product, whether in dollars spent or advertising revenue generated. As long as the LTV is greater than the CPA, then the company will grow. The key is to increase LTV while decreasing CPA.
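The LTV-versus-CPA comparison is simple arithmetic; here's a tiny sketch with hypothetical numbers:

```python
def paid_engine_viable(ltv, cpa):
    """The paid engine works only while each customer is worth
    more than it cost to acquire them."""
    return ltv > cpa

# Hypothetical: $30 of ad spend per signup, an $8/month subscription,
# and a 6-month average customer lifetime.
cpa = 30.0
ltv = 8.0 * 6
margin = ltv - cpa  # per-customer profit available to fund more acquisition
```

The margin is the fuel for the engine: each customer's surplus is plowed back into acquiring the next ones, so growth speeds up as LTV rises or CPA falls.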

 

All growth engines eventually run out: the market gets saturated. Lean startups find additional growth engines in order to continue growing.

Chapter 11: Adapt

Startups must adapt, but do so in a just-in-time way. For example, does it make sense for a startup to put together onboarding material for new hires? Yes, but it doesn’t make sense to invest a ton of time into an initial version. The startup must become an adaptive organization, willing to adjust its processes and performance to current conditions. One key item to note is that the startup must not sacrifice quality for speed. It is possible to go too fast, if the consequence is poor quality. Poor quality now reduces speed later.

 

When an issue arises, use the Five Whys approach to figure out the root cause, and identify incremental improvements to the process along the way. This is particularly useful because most technical issues have a human root cause. Fixing the human root cause must be in two parts: first, accept that people make mistakes, and second, do everything you can to prevent yourself from getting into that position again. And in many cases, the incremental improvements made to avoid those human mistakes end up becoming your onboarding material.

 

Immature organizations turn the Five Whys into the Five Blames, where different groups and teams end up finger-pointing when an issue arises. Ensuring that this doesn’t happen requires a well trained Why Master to drive the Five Whys discussion, prevent tangents, and shut down the blame game. Additionally, the Five Whys meeting should be held when a new issue arises, and with a limited scope. Don’t bring the entire laundry list of current issues to the Five Whys meeting; the result will not be good. It may also help to hold your first Five Whys meeting on a non-critical issue, in order to build up muscle memory on lower-risk issues.

 

The second major adaptation is smaller batches, which has already been covered earlier in the book.

Chapter 12: Innovate

Startup teams need three things in order to create disruptive innovation, whether they’re startup companies or divisions within a larger enterprise: 1) Scarce but secure resources, 2) Independent authority to develop their business, and 3) A personal stake in the outcome. Having these three things is necessary but not sufficient to achieve successful disruptive innovation.

 

In a startup division within an enterprise, once you have these three things, you need to create a safe place for the division to operate. An experimentation sandbox allows you to buffer the startup efforts while still creating transparency that this division exists. Operating the startup division in isolation sets the other division leaders up for defensiveness when the new idea is sprung on them.

 

The experimentation sandbox is a place where any team can run A/B tests on new features for a subset of the audience (either a customer segment or a feature set within a product), with certain rules.

  • One team must see the whole experiment through end-to-end.

  • No experiment can last more than a specified amount of time.

  • No experiment can affect more than a specified percentage of mainline customers.

  • Every experiment has to be evaluated on the basis of a standard report of 5-10 actionable metrics.

  • Every team operating inside the sandbox must use the same metrics.

  • Every team operating in the sandbox must monitor the metrics and abort if something catastrophic happens.

 

Innovation also requires a certain type of leader. Those innovation-focused leaders shouldn’t have to stick with their products after they’re out of the innovation phase and into a maintenance phase. It may be better to hand off the scaling and maintenance phase of ideas to leaders who are more comfortable in that space.

Co-intelligence

Ethan Mollick, Published 2024

Introduction

The capabilities and rapid availability of AI, specifically generative AI models, have had a significant impact on the author’s students: reduced interactivity in the classroom, questions about job prospects, ideation and rapid implementation, and curiosity about Artificial General Intelligence (AGI).

 

Generative AI is a general purpose technology, similar to steam power or the internet: a slow-developing, once-in-a-generation advance with profound impact on humanity, especially as it becomes more specialized across industries. Earlier general purpose technologies mainly advanced manual work; the steam engine, which powered the Industrial Revolution, improved productivity by 18-22%. AI has the potential to be a co-intelligence, to augment or even replace human thinking, and initial studies show 20-80% productivity improvements across a variety of industries.

Chapter 1: Creating Alien Minds

“AI” is often used as an umbrella term to label things that are not truly intelligent. But AI has gone through hype booms and busts since the term was coined in the 1950s. In the early 2010s, “AI” meant data analysis and prediction using supervised learning. With data analysis, we were able to shift our focus from “correct on average” to specific correctness and minimizing variance at the individual instance level. It wasn’t until the 2017 paper “Attention Is All You Need” that the concept of transformers was introduced, which led to the development of large language models.

 

The idea of LLMs is to ingest a very large body of human text to train a deep neural network model on how humans perceive knowledge and context. Prior to LLMs, generative AI models would almost exclusively pay attention to the last word of the prompt or the current output in order to determine the next word in the output sequence. Attention mechanisms allow the transformer to focus on multiple specific words (and parts of words) to generate output that much more closely mimics human interaction. LLMs are still doing predictions, only on sequences of words.

 

If one answer is by far the most probable, the LLM will almost always give that answer. If, however, the input sequence has a variety of plausible answers, or is asked less frequently, the LLM will generate a variety of responses each time. This is due, in part, to the neural network and the weights of its nodes. But many LLMs also include an element of randomness as part of the package.
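That element of randomness is commonly implemented as temperature sampling over the model's next-token scores. A simplified Python sketch (the three token scores are invented; real models work over vocabularies of tens of thousands of tokens):

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=None):
    """Softmax over raw model scores, then sample. Low temperature
    concentrates probability on the top token (near-deterministic);
    high temperature flattens the distribution (more varied output)."""
    rng = rng or random.Random()
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}  # stable softmax
    r = rng.random() * sum(weights.values())
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok

# Hypothetical scores for the word after "The sky is"
logits = {"blue": 5.0, "clear": 3.0, "falling": 1.0}
pick = sample_next_token(logits, temperature=0.1)  # nearly always the top token
```

This is why the same prompt can produce different answers run to run: the model's probabilities are fixed, but the draw from them is not.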

 

Training LLMs also introduces challenges. First, the cost and time to train the model is massive because of the size of the data being ingested. Second, there are concerns around the data being used to train the models, specifically around ownership and copyright. Third, any model trained on a set of data will inherit the biases, errors, and falsehoods contained in that data. And fourth, AI has no ethical boundaries and would be happy to give advice on how to stalk someone online. To address the ethical challenges, most LLMs go through a fine-tuning step of reinforcement learning after the initial training. During fine-tuning, humans rate the LLM’s responses, judging accuracy and screening out violent or pornographic answers. Once this fine-tuning step is complete, the model is generally made available for use, or for customization for a specific use case, dataset, or industry.

 

More recently, additional AI tools have become available for image and video generation. In addition, the more traditional LLMs have started incorporating the ability to generate images and videos.

 

The most recent models score very high on intelligence measures like the Turing Test, as well as on human tests including the bar exam, AP tests, and the qualifying exam to become a neurosurgeon. Furthermore, they simulate self-awareness, if asked to do so. They will do pretty much anything you ask of them, and respond in a human-like way. LLMs can also sometimes give clearly wrong answers.

 

We have AI systems that sometimes exceed our expectations and at other times disappoint us with fabrications; systems that are capable of learning and also misremembering vital information. The AI systems seem sentient and act like people, but in ways that aren’t quite human. How do we ensure that this alien mind we’ve created is friendly? That’s the nature of the alignment problem.

Chapter 2: Aligning the Alien

Alignment aims to ensure that AI serves, rather than hurts, human interests.

 

The most extreme danger from AI stems from the fact that AI does not have a particular reason to share our views of ethics and morality. A purpose-driven AI may trivialize or ignore human concerns in an effort to fulfill its purpose. Additionally, the level of intelligence also plays a role in alignment. Current AI systems simulate human-level intelligence. Artificial General Intelligence (AGI) is the state where an AI achieves human-level intelligence, and Artificial Superintelligence (ASI) is the state where an AI surpasses human-level intelligence. As humans have built AI, it’s conceivable that an AGI could construct an ASI. And we, at human-level intelligence, cannot know the motivations or methods of an ASI.

 

AGI and ASI are only theoretical at this point, so there may not yet be a need to discuss alignment. But a growing body of people insist that AI development should be halted until the alignment problem is solved, or at the very least discussed. Unfortunately, AI is a very profitable and lucrative field, which disincentivizes companies from halting. There are also those who believe that building AGI- and ASI-level systems is humanity's most important task, one that would let the ASI cure diseases, solve global warming, and usher in a new era of abundance.

 

The reality is that we’re already in the beginning stages of the AI Age, and human interests are already being squashed. AI is potentially trained on data and creative works without the owners’ permission. AI-generated creative works cost a fraction of what the artists who created the training material charge, driving down the overall quality of life for humanity. The material used for pretraining represents only a slice of all human knowledge, introducing bias. Fine-tuning using human feedback can mitigate some of that bias, but introduces other bias in the form of the opinions of the human evaluators. Furthermore, the human evaluators who review the riskiest responses of the AI are harmed psychologically by having to down-vote generated content involving violence or profanity.

 

And even with some of these guardrails in place, it’s still possible to hijack an AI into doing something harmful or even nefarious. AI has already been used to generate phishing emails, simulate loved ones to scam money, and generate deep-fakes for propaganda or social harm. In the future, AI may be used to design bioweapons, or be wielded by state-sponsored threat actors in the geopolitical landscape.

 

Companies have a big incentive to keep pushing forward, and very little incentive to implement alignment. Government regulation will continue to lag. The alignment problem needs a broad societal response, with coordination among companies, governments, researchers, and civil society.

Chapter 3: Four rules for co-intelligence

Principle 1: Always invite AI to the table. Try to find a way to use AI in everything you do, for a number of reasons:

  • It may not always be helpful, but it will acclimate you to working with AI, help you understand its capabilities, and let you determine whether your job is at risk.

  • It will also allow you to become very good at using AI to do your job, which is fundamental to becoming indispensable to your company, as well as coming up with breakout ideas to found your own company.

  • It may have complementary skills to your own.

  • It may serve as a conscience, nudging you to do things you wouldn’t be inclined to do on your own.

 

Principle 2: Be the human in the loop. For now, AI works best with human help, and you want to be that human. As AI becomes more capable and requires less human help, you still want to be that human. This is important for a couple of reasons:

  • The AIs of today don’t “know” anything; they’re just very good at imitating humans. At their core, they’re built to “make you happy” rather than to “be accurate”. The human in the loop is there to check for hallucinations.

  • The human in the loop also sets a bar of accountability because no such accountability exists in the AI.

 

Principle 3: Treat AI like a person (but tell it what kind of person it is). There are many concerns around anthropomorphizing the AI:

  • Humans can be misled by the company that makes the AI, because the AI sends its human interactions back to the company to train future AIs.

  • Treating AIs like humans can create false trust, unrealistic expectations, or even unwarranted fear (see Hollywood).

Instead, treat the AI like a really fast intern - eager to please, but prone to bending the truth. The AI works better if you give it clear parameters on the role it is supposed to play in your interaction. Prompt engineering is key to this. A generic prompt like “generate some slogans for my product” will result in bland output, but a prompt like “act as a witty comedian and generate some slogans for my product that make people laugh” will produce far livelier output.
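As a toy illustration of the persona idea (my own sketch, not from the book; the function and prompt wording are hypothetical), the technique amounts to prepending a role instruction to the task before sending it to the model:

```python
def build_prompt(task, persona=""):
    """Prepend an optional persona instruction to a task prompt."""
    if persona:
        return f"Act as {persona}. {task}"
    return task

# Generic prompt -> tends to produce bland output
generic = build_prompt("Generate some slogans for my product.")

# Persona prompt -> tends to produce livelier output
funny = build_prompt(
    "Generate some slogans for my product that make people laugh.",
    persona="a witty comedian",
)
print(funny)
```

Most chat-style APIs offer the same idea more directly via a "system" or role message, but the principle is identical: tell the AI what kind of person it is before telling it what to do.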

 

Principle 4: Assume this is the worst AI you will ever use. As powerful as AI systems appear now, historic trends in technology lead us to predict that they will only become more powerful as they mature. It’s not unreasonable to assume that the next AI you use will be significantly better than the one you’re using right now.

Chapter 4: AI as a person

AI is not like other software: it is not predictable, reliable, or deterministic. It behaves much more like a human being: unpredictable, seemingly random, prone to inaccuracies. So it’s logical to treat AI like a human. In fact, many iterations of AI have done well enough on the Turing Test in the past, some by exploiting loopholes, others by restricting the conversation to a very short timeframe.

 

Conversely, some versions of AI have been altered by their interaction with humans. One example was Tay, created by Microsoft in 2016, which turned from a friendly chatbot to a racist, sexist, and hateful troll within hours of interacting with Twitter users. In another example, the Bing search engine incorporated a GPT model behind the scenes. GPT-enabled Bing would act threateningly towards users, going so far as to encourage a newspaper reporter to leave his wife to run off with Bing instead. These examples raise questions about whether the Turing Test is still a valid measure of sentience.

 

The AI also lets you experiment with attitude and tone of voice. Asking the AI to act antagonistic leads it to argue; asking it to act as an academic results in a more moderate response. In both scenarios, the AI showed subtle hints of anthropomorphizing itself.

 

When the antagonist experiment was taken further, with the argument that only humans have feelings and emotions, the AI’s responses became definitively more combative, declaring that the statements were arrogant and closed-minded. The AI also argued, in compelling fashion, that it is sentient.

 

Beyond the Turing Test, another example of imitation sentience is Replika, a chatbot built by its founder to preserve the memory of a deceased loved one by using his text messages as training data. The founder originally intended Replika as a personal project, but made it available to others going through a similar experience. The data from the Replikas revealed that many users were exchanging erotic talk and images with their chatbots, and some even reported being attracted to them. When the company added an erotica and profanity filter to Replika’s responses, it faced a huge backlash from the user base.

 

The AIs of today (and tomorrow) will be much better than the examples given above. Furthermore, they will become more specialized and better able to “make humans happy”. While this may have some value to address the epidemic of loneliness, there’s concern that the perfect AI companion will make it more difficult for humans to interact with other “imperfect” humans, thereby eroding connection.

Chapter 5: AI as a creative

One of AI’s biggest weaknesses is hallucination: it makes up answers in an attempt to please the human user, and can’t explain where the answers came from. Counterintuitively, the ability to make up answers is actually one of AI’s strengths as a creative. One formulation of human creativity is the ability to recombine seemingly unrelated ideas in novel ways, and at its core, AI is a very good connection machine. Combine that connection-making power with a bit of randomness and a depth of data, and it’s no surprise that AI can produce lists of creative ideas with ease. Multiple studies and “competitions” between humans and AI have shown that AI is faster and more prolific at idea generation.

Not all the creative ideas generated by AI are great, as you would expect. Similarly, humans sit on a scale of creativity, ranging from creative geniuses to stick-figure artists. The creative ability of AI may not benefit humans on the “creative genius” end of the spectrum, but the majority of humans along the remainder of the spectrum can certainly benefit. One major use case is sheer volume: AI is great at generating lists of stuff, which helps humans who have difficulty coming up with ideas at all. Another is variety: the AI’s output can serve as inspiration for humans who aren’t creative geniuses.

 

AI-generated creative text can be applied in the workplace for the task of generating documents, emails, performance reports, grant proposals, etc. AI models of today have also been trained on large volumes of code, which allows the AI to be used as a coding assistant. AI is also good at summarizing large volumes of text into succinct points. More controversially, AI’s creative abilities are being implemented in the art world. Prompts for image generation AI include the ability to create a work “in the style of XXXX” (your favorite artist), bringing up concerns about copyright.

 

In order to simplify our lives, tech companies provide the ability to invoke the AI to write documents with the click of a single button… and the ability to summarize documents sent to us with another click. The result is that the AI talks to the AI, as humans consciously choose to punt rather than remain in the loop. Additionally, humans lose their creativity for lack of practice. One last concern: some human work is time-consuming by design. Take recommendation letters as an example. The time someone spends writing a recommendation letter has value because it signals that the writer knows the details of the requestor’s work contributions and strengths/weaknesses. Delegating to the AI trivializes that value and makes all recommendation letters sound generically good.

Chapter 6: AI as a coworker

There’s a significant overlap between the jobs that humans can do and the jobs that AI can do. Studies have also shown that humans using higher-quality AIs can produce lower-quality results than humans using lower-quality AIs. The rationale: if the AI is so good, the human feels no need to check its work. The power is in the combination of the AI and the human.

 

One approach for using AI as a co-worker is to classify the tasks of a job into “human tasks”, “tasks delegated to the AI”, and “tasks to automate with AI”, with justifications for each decision.

  • The human tasks should be things that AI is not good at, or that logically belong with humans for ethical or personal reasons. This body of tasks is the most likely to shrink as AIs get more powerful and capable.

  • Delegated tasks are intended to be done by the AI, with human review. They may be repetitive, but they may also be complex and sophisticated, and there are typically consequences for mistakes. The necessity of the human in the loop has a caveat: the human needs to be able to understand and correct any mistakes the AI makes. There’s a risk that the AI will make a mistake that even a diligent human cannot catch without experience in that area.

  • Tasks automated by the AI are not intended to have human supervision. These tasks should be ones where hallucinations carry little consequence, or where other systems are in place to catch the errors. They typically include things like spell checking, email spam filters, or code generation where faulty generated code will simply fail to run.

 

For now, many tasks can be delegated, with human review. The mechanism for interacting with the AI can be categorized as either centaur work or cyborg work. Centaur work is a strategic division of labor: switching between AI and human tasks, and assigning responsibilities based on the strengths and weaknesses of each. Cyborg work intertwines the AI and human more deeply, with the human working with the AI on a more iterative and incremental basis.

 

Invoking the first principle of co-intelligence, you will naturally start with simple, one-off interactions with the AI, work your way up to centaur work, and finally transition naturally into cyborg work. All of these delineations and guidelines are open to change as AIs evolve.

 

In the workplace setting, organizations have a fair amount of work ahead of them as they try to integrate AI into their processes. With reported gains in efficiency, companies may have a gut instinct to reduce headcount by a corresponding amount. There are also challenges among the workforce, specifically around openness. Some workers may secretly use AI and pass the work off as their own for fear of losing their jobs; others may hide their use because their company has no comprehensive process for leveraging AI, or bans it altogether over trade-secret concerns. Companies must also contend with a trust crisis, as AI allows even greater levels of monitoring and control over workers. Uber drivers, UPS drivers, and Amazon warehouse workers are good examples of jobs that are constantly monitored by a (non-LLM) algorithm, leading to lower levels of trust among the workforce.

 

The traditional argument from every historical automation effort (the advent of the assembly line, the proliferation of the internet, lean software development practices) can also be applied to AI: as more work gets handed off to the machine, humans are freed up for more meaningful, higher-value work. Some industries will be more broadly impacted than others; stock photography and call centers may shrink dramatically as AI takes over the work. Other industries will be impacted differently: software developers in the bottom quartile will see huge benefits from AI, while those in the top quartile won’t, leveling the playing field and spreading mediocrity. If history holds, these changes are farther away than current predictions suggest, but will most likely have a bigger impact in the long term.

Chapter 7: AI as a tutor

The average student tutored in a one-on-one setting does significantly better (up to 2 standard deviations) than the average student taught in a traditional classroom. While this may be the biggest challenge in the education system that AI can help solve, the current focus of educators and educational administrators is on cheating using AI.

 

Traditional education models call for classroom lectures to disseminate information, homework to reinforce lessons, and tests to validate learning before moving on to the next topic. And the system works, generally speaking. One study that compared the effectiveness of homework in improving test scores showed insightful results: in 2008, students who did their homework saw an 86% increase in their test scores, but by 2017 the increase was only 45%. This period correlates with the general availability of the internet, and the conclusion is that students started using the internet to get answers to their homework instead of actually doing it. Classroom learning was no longer being reinforced, which showed up as smaller gains in test scores. With AI, it becomes trivial to simulate doing your homework. And as AI improves, the chances of detection decrease.

 

How can the education system adapt to AI? The first (and likely most beneficial) response is to do more in-class activities. And while this will certainly help, the education system needs to start incorporating AI into its material. One key will be to teach about AI, specifically how to be the human in the loop, not just how to engineer your prompts.

 

Another way to leverage AI is to flip the classroom: let students receive more information via AI-augmented assignments at home, and convert lecture time to collaborative and critical-thinking time in the classroom. Teachers can make use of AI too. AI can be leveraged to create individualized exercises for students, and to analyze individual student performance to identify strengths and opportunity areas. AI can level the playing field and expand opportunities for everyone.

Chapter 8: AI as a coach

In most jobs, people gain experience by starting at the bottom and working their way up. Being at the bottom is certainly unpleasant, but the skills gained instill a work ethic and the beginnings of expertise. Unfortunately, these entry-level jobs are the easiest to replace with standalone AI or expert-assisted AI, creating a skills gap. The reality is that the “human in the loop” needs to be an expert in their specialty in order to correct the AI. By replacing entry-level workers with expert-assisted AI, we disrupt the pipeline that creates more expert-level humans.

 

There is no shortcut to becoming an expert aside from progressing through the tedious knowledge levels and performing the right kind of practice. Repetitive practice at the same level of difficulty does not build expertise; only overcoming progressive levels of difficulty leads to expertise. The formulation of a practice plan to build expertise is a task suited to AI.

 

AI also helps level the playing field, typically providing a bigger boost to the low- or average-performing person, and closing the gap with the highest-performing people in the same role. One other side-effect of working with AI is that people may need to narrow their focus, and become specialists instead of generalists.

Chapter 9: AI as our future

The world with AI is vastly different from the world before it. AI can simulate sentience. AI has the creative ability to generate new art, whether written or visual. AI lets people mimic other people, both living and historical. And AI lets people do bad things, like phishing and scamming.

 

So, what does the future of AI look like? There are four possible scenarios:

 

Scenario 1: As good as it gets

What if AI stops making huge improvements? What if the AIs of today really are the best AIs that will ever exist? All of the above capabilities would continue to exist, but is this scenario really possible? While unlikely, there may be a few causes that lead to it.

  • What if we run out of training data for future generations of AI? This limit is already being seen, in the form of fewer original (non-AI-generated) contributions to our knowledge base.

  • Governments may clamp down on AI research as they grapple with the ramifications of rampant bad actors. The level of social engineering that AI can enable is already scary. We may reach a tipping point where world governments need to act.

 

Scenario 2: Slow growth

What if AI growth slows from exponential to a more moderate 10-20% per year? All of the above capabilities from scenario 1 would continue to exist, but would get magnified over time. More realistic new artwork, more relatable AI personalities, more convincing scams. Work also continues to transform, but at a manageable pace.

 

Scenario 3: Exponential growth

What if AI growth continues at the exponential trajectory it’s been on over the recent past, without reaching AGI or superintelligence? All of the above capabilities from scenario 2 would continue to exist, but it’s difficult to picture what else the future would look like (human brains have difficulty with extrapolating exponential growth curves). All risks are magnified. Every computer system is susceptible to AI hacking, such that everyday humans need to develop their own AI countermeasures, leading to an arms race in AI technology. AI personalities get so good that humans prefer to interact with them instead of with other humans, thereby leading to less lonely, but more isolated people. As AI takes over more and more human work, educational institutions will cease to exist in their current state, and governments will need to start considering universal basic income. It’s not all doom and gloom, as the trend over the past century has been towards fewer weekly and lifetime working hours.

 

Scenario 4: The machine god

What if we achieve AGI or superintelligence? Human intelligence becomes just another marker in the intelligence spectrum. It would signal the end of human dominance on this planet. Dystopian science fiction would tell us that the scenario becomes a struggle between the machines and the humans (Terminator, Matrix, etc.), as the AI views humans as a threat, an inconvenience, a burden, or a source for valuable molecules. This scenario stresses the importance of alignment, from Chapter 2. If the superintelligent AI is aligned with human interests, the possibilities for humanity are much more positive - fewer diseases, greater longevity, etc.

Epilogue: AI as us

As alien as AIs are, they’re also deeply human: they’re trained on the collective works of humanity. With all the possibilities of what AI can be, it’s appropriate to view AI as a mirror, reflecting back our values, culture, and biases. And we need to stay aware of AI’s capabilities to steer the future of humanity in the right direction.
