LLMs learn things only in one direction!

In this post you will read about our workshop on LLM Evaluation, an interesting observation about how LLMs learn, what it takes to introduce change in your life / work, and more!

Nov 12, 2023

In my Substack, Deep Random Thoughts, I share a randomly selected set of my writing and updates every week. My posts will be related to LLMs (AI in general), product dev and UX, health (founders’ flavor), startup-related topics, and of course the events we run!

Asks and Announcements

One Day Workshop on LLM Evaluation - add to your calendar

Add to your calendar: https://lu.ma/llm-eval

Last week at Aggregate Intellect!

We have been looking for mid-market service businesses as our primary commercial target for a while now. So, this week I did something completely out of my comfort zone. I went to a “manufacturing” trade show, and talked to people about their AI needs, and interestingly found a half dozen decent leads. We will see if any of those will convert, but it was quite an interesting experience.

Last week’s unsung hero

I would not have done the above without encouragement from our advisor and investor, Darryl. Thank you, Darryl, for pushing me to do the right thing for my business.

LLM Stuff

I started this week by speaking at Analytics Vidhya Data Hour about LLM multi-agent systems, and how they will impact knowledge discovery and innovation.

You can watch the recording here.

One of the new things I spoke about was a comparison between the autonomy of LLM agents and the autonomy we’ve seen historically in the context of driving. I will be writing a more comprehensive series on this topic soon, but for now, hopefully this table will give you a sense of how I think about it:

**Autonomy in LLM multi agent systems versus autonomous driving**

If I teach an LLM "A is related to B", can it then tell me how "B is related to A"?

We discussed a paper that takes a deep look into this question at our journal club and saw some interesting nuances!

CONTEXT:

In a study published in September 2023, researchers uncovered a startling limitation in large language models: the "Reversal Curse".

🧐 What's the Reversal Curse?

When an LLM is trained on phrases like "A is B," it turns out that it won't automatically grasp the reverse, "B is A." For example, if a model learns "Olaf Scholz was the ninth Chancellor of Germany," it won't readily answer, "Who was the ninth Chancellor of Germany?" with "Olaf Scholz." In fact, the likelihood it assigns to the correct name is no higher than for a random name.

🤔 Why Does This Matter?

This reveals a fundamental issue in logical deduction within these models. If "A is B" is true, it logically follows that "B is A," but LLMs seem to struggle with this basic inference. This limitation persists, even if the model comprehends logical deduction. It also extends to real-world scenarios, where LLMs perform better on questions like "Who is Tom Cruise's mother?" than on the reverse, "Who is Mary Lee Pfeiffer's son?"

📊 What Did the Experiments Show?

The study tested different LLMs, including GPT-3 and Llama, under various conditions. When the order of information matches their training data, the models excel, achieving accuracy rates around 96.7%. But when the order is reversed, accuracy plummets to near-zero, as if the models are guessing randomly.
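To make the measurement concrete, here is a minimal sketch of a forward-versus-reverse accuracy check. This is not the paper's code: the names and descriptions are made up, and the "model" is a toy lookup table that is one-directional by construction. It illustrates how the two accuracies are compared, not why real LLMs behave this way.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical (name, description) pairs, invented purely for illustration.
FACTS: List[Tuple[str, str]] = [
    ("Alice Rivera", "the first person to map the Lumen caves"),
    ("Bram Okoye", "the composer of the Tidewater symphony"),
    ("Chen Halvorsen", "the inventor of the folding kite"),
]


def build_one_way_model(facts: List[Tuple[str, str]]) -> Callable[[str], str]:
    """Toy stand-in for a model that only saw 'name is description' text.

    It stores name -> description and nothing else, so it is an assumption
    baked into the sketch, not evidence about real LLMs.
    """
    memory: Dict[str, str] = {name: desc for name, desc in facts}

    def answer(prompt: str) -> str:
        # Unknown prompts get an empty answer.
        return memory.get(prompt, "")

    return answer


def exact_match_accuracy(model: Callable[[str], str],
                         pairs: List[Tuple[str, str]]) -> float:
    """Fraction of prompts whose answer matches the target exactly."""
    correct = sum(1 for prompt, target in pairs if model(prompt) == target)
    return correct / len(pairs)


model = build_one_way_model(FACTS)

# Same order as "training": ask with the name, expect the description.
forward_pairs = [(name, desc) for name, desc in FACTS]
# Reversed order: ask with the description, expect the name.
reverse_pairs = [(desc, name) for name, desc in FACTS]

forward_acc = exact_match_accuracy(model, forward_pairs)
reverse_acc = exact_match_accuracy(model, reverse_pairs)

print(f"forward accuracy: {forward_acc:.2f}")
print(f"reverse accuracy: {reverse_acc:.2f}")
```

With a real model you would swap the lookup table for a call to whatever LLM you are testing and keep the same two lists of prompt/target pairs.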

🤯 The Big Question

Why do LLMs suffer from the Reversal Curse? The paper hints at future research to explain this phenomenon. One theory suggests that the gradient update process during training may be myopic, focusing on the immediate context rather than creating a holistic understanding.

This research highlights the complexities and challenges in training large language models, and the danger of blindly and superficially assuming emergent capabilities without proper investigation.

Paper: https://arxiv.org/pdf/2309.12288v1.pdf

GitHub: https://github.com/lukasberglund/reversal_curse

Join our weekly discussions: https://lu.ma/llm-journal-club

In our recent LLM workshop, Monish Gandhi discussed the practical applications, impact, and ROI of generative AI. He highlighted two projects where generative AI played a significant role and discussed the impact and ROI of using LLMs in various business contexts. He also discussed the decision between using third-party APIs and hosting one's own LLMs, the importance of maintaining flexibility in architecture, and the different paths to achieving ROI. He concluded by emphasizing the need for clarity of purpose, alignment of objectives, and a growth mindset within organizations.

Topics:
———
✳️ Practical Applications of Generative AI
* Generative AI can be used to generate design assets without the need for a designer.
* Generative AI can automate processes and create new products in various industries.
* Using LLMs to label data for task-specific models can lead to significant improvements.

✳️ Impact and ROI of Generative AI
* Using generative AI and LLMs generally has a positive impact but requires investment in building capabilities and understanding.
* LLMs may not be robust in certain cases, requiring increased investment in data quality.
* Implementing generative AI technologies has upstream and downstream impacts beyond the initial investment.

✳️ Decision between Third-Party APIs and Hosting Own LLMs
* Hosting one's own LLMs provides greater control, predictability, and flexibility for model improvement.
* Using LLMs for NLP tasks reduces development effort, lowers costs, and speeds up time to market.
* Hosting one's own LLMs may increase competition and reduce differentiation among companies.

✳️ Maintaining Flexibility in Architecture
* A modular approach to LLMs and APIs helps maintain flexibility without adding unnecessary complexity.
* Consider both incremental ROI and new threats/opportunities when designing architecture.
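One way to picture the modular approach is a thin interface that application code depends on, with the third-party API and the self-hosted model as interchangeable backends behind it. The sketch below is my own illustration, not something shown in the talk: `TextGenerator`, both backend classes, and `summarize_ticket` are hypothetical names, and the backends return canned strings instead of calling a real service.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """The only surface the application code is allowed to depend on."""

    def generate(self, prompt: str) -> str:
        ...


class ThirdPartyApiBackend:
    """Placeholder for a hosted API client (hypothetical, no network call)."""

    def generate(self, prompt: str) -> str:
        return f"[api] {prompt}"


class SelfHostedBackend:
    """Placeholder for a model served in-house (hypothetical, no model)."""

    def generate(self, prompt: str) -> str:
        return f"[self-hosted] {prompt}"


def summarize_ticket(llm: TextGenerator, ticket: str) -> str:
    """Application logic that does not know which backend it is using."""
    return llm.generate(f"Summarize: {ticket}")


# Switching backends is a one-line change at the call site.
print(summarize_ticket(ThirdPartyApiBackend(), "Printer is offline"))
print(summarize_ticket(SelfHostedBackend(), "Printer is offline"))
```

The point of keeping the interface this small is that moving from an API to your own model (or back) does not ripple through the rest of the codebase.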

✳️ Achieving ROI with Emerging Technologies
* Alignment of objectives and problem-solving are important for achieving ROI.
* Consider desirability, feasibility, and viability when prioritizing projects.
* Build solutions that generate feedback and improve over time.

✳️ Cultural Factors and Cost of POCs
* Cultural factors such as resistance to change and willingness to experiment can affect the cost of POCs.
* While the cost of POCs has decreased, the risk of failure remains.
* Addressing cultural issues is essential for successful implementation of emerging technologies.

✳️ Adapting to the Changing Landscape
* Clarity of purpose, alignment of objectives, and a growth mindset are crucial for success with emerging technologies.
* Consider qualitative aspects of ROI and focus on human qualities, domain knowledge, and high-quality data.

OpenAI Dev Day

Well, the big topic of the week was this. I’m sure you’re sick of it at this point because every influencer and news outlet has written about it to death.

I’m personally excited about lower API costs, the new Assistants features, and the larger context windows! The rest is detail for me.

The case for holding off on generative AI

This week I came across this article that really resonates with me, judging from all the conversations I’ve had with business owners and stakeholders. Everyone kinda wants to jump from nothing to everything and that’s just not how it works!

Here’s Sherpa’s take on this:

The key takeaways from the article are:
1) Adoption of generative AI in the cloud is slower than expected due to a lack of skilled professionals, data quality issues, and policy concerns.
2) It is recommended to wait until all the pieces are in place before launching expensive projects.
3) It may take up to four years to see the full value of generative AI in the cloud.

Startup Stuff

At our "founders' health" session we discussed quite a few topics, including why startups fail.

The following is my take, curious to hear yours!

Why do Startups Fail?

This is obviously a very complex and nuanced problem that requires work well beyond the scope of this post, and my expertise. But it is probably fair to say that what startup failure cases have in common is that they happen when the founder gives up!

Almost all the necessary factors of success in a startup are things that can be acquired and learned. Of course this is a broad statement and assumes that one has the initial resources and privilege to even give it a try. There are many people from underestimated communities who don’t even get a chance to play this game. But for those who do, their companies go under when they give up. But why do they give up?

🤯 Because they are so stressed and sleep deprived that their cognitive abilities, say memory or critical thinking, are underperforming, resulting in accumulation of poor decisions to the point of no return.

🤯 Because they “don’t have time to eat well or exercise enough”, so their energy level is as volatile as the markets and they can’t get all their urgent and important tasks done.

🤯 Because they think they have to be tough and unforgiving, and therefore end up having no time to slow down and calm their brains, and therefore make lots of emotionally charged and anxiety driven decisions.

🤯 Because they are so focused on their “project” that they forget to build meaningful relationships with people who might become their co-founders, employees, investors, customers, and partners.

🤯 And of course, perhaps the most obvious, they mess up their personal finances to the point where continuing to work on the startup stops being viable. Note that this can happen even when the company has money, if that money is committed to everything other than the founders themselves.

My point here is that there are many ways that the founder might end up at a point of no return. But I am ready to bet that if you come up with any other reasons, they will fall under these 5 broad categories of issues affecting the founders themselves: cognitive, physical, mental, social, or financial.

What am I missing?

What is the role of humor in business?

I recently attended a virtual summit targeted at salespeople. It was very, very different from the usual quick and dry academic-style events I’m used to. There was flash, there were bells and whistles, and importantly there was lots of humor!

It reminded me of a podcast I was listening to recently about this topic. Here are some of the reasons I am considering incorporating more humor into my business interactions:

* If done right, it does show confidence and being at ease which would convey to the counterpart that this is a safe space.
* It could demonstrate my understanding of the person’s context if the humor is relatable for them which could help in building trust.
* It can leverage playfulness and surprise which is generally a good way to capture the audience’s attention.

Do you use humor in your business interactions? How?

Health Stuff

In the past few months, I’ve lost approximately 20 lbs!

About a year before the pandemic lockdown, I picked up a sports related knee injury. It was a blow to my fitness because in the few years before that I had recovered from the extra pounds I picked up at the end of my PhD, and I was regularly doing a couple of half marathon races every year. This injury was the last straw after some emerging issues though. I had gained weight and was generally not doing great emotionally. I was unhappy with my job and in a search for what would be the next move, and kinda sorta playing around with the idea of the startup.

I finally got over myself and started doing pilates every day, and continued for about 6-7 months. I also finally made the decision that I was done with my search and I wanted to be a full time founder. So, I quit my corporate job and went full time on Aggregate Intellect. Everything was great, right? Well, yes… except that this all was happening in early 2020! So, the pandemic lockdown completely fucked up my progress!

The next few years became a mix of anxiety, stress, lack of sleep, and isolation. I went to the brink of giving up on the company, and severely damaged many personal relationships, and the only thing I got very good at was overeating.

Just over two years ago, I started the “founders hike” group that you might have seen, in an attempt to get this unwieldy situation under control. In many ways this initiative did wonders for me (and apparently all the other people who have since joined). The weekly physical and emotional vent, and the lockdown winding down, finally started to really help me recover mentally and emotionally. But physically, I continued being a mess: sleep problems, anxiety-driven eating, lack of activity due to knees not keeping up, … you name it!

This continued until my 40th birthday earlier this year. I don’t know if it was a mid-life crisis or what, but I spent a lot of time thinking about my choices, and especially noticing many of the everyday choices like yelling at the poor bastard who cut me off on the street as I was biking on the bike lane. Or the time that I stepped into the intersection late at night and a stupid driver almost ran me over. It was very clear that what I was doing was not sustainable.

I started talking about having a group of founders who would meet every week and talk about our health: social, mental, cognitive, and physical. I was convinced that working on any of these alone would not suffice and I had to think about and adjust all of them. I made many adjustments to how I lived and worked and I will write about those soon. But one of the favors I did to myself was picking up this book that was on my reading list for a few years!

In this book, Nigel Marsh writes about a similar period of time in his life where he chooses to go from a cushy corporate job to being a house husband for a year while he writes the book. I do strongly recommend that you read this book yourself, but in order to spoil the ending for you, here are the 12 lessons he wraps up the book with:

1) One year is a surprisingly short amount of time to get the dominant mindset you have out of your system and change your ways. But it would be a good opportunity if you could use a year to get closer to the fact that “not having to pretend is a huge blessing”.
2) You don’t have a lot of time to spend with people you love, so if you’re not present during the limited time you have, then you’re missing out.
3) If you want to make meaningful permanent change in your life, you have to be careful about where your motivation comes from (e.g. internal vs external).
4) The only change that matters is long-term, sustainable change. “Any buffoon can effect short-term ephemeral change.”
5) Setting goals and reaching them only buys you a moment of happiness. Meaningful change is an ongoing process.
6) Change with the goal of reaching perfection is a fool’s errand. Meaningful sustainable progress is the only relevant benchmark.
7) Once you are old enough, you probably know all the right answers. The trick is implementing them properly.
8) Working out how to live a more balanced authentic life is a complex, slow, and nonlinear process. Buckle up for some hard work.
9) It is ok not to chase mainstream society values, and instead look for what genuinely makes you happy and allows you to be yourself.
10) Finding a place where you can work and live as your authentic self is the greatest privilege in the world.
11) Real life is lived in the gray areas.
12) It is ok to take a break from the hamster wheel, reflect, and use that to decide how you want to be in the hamster wheel if you ever go back to it.

These are some lessons from the book, but also things that I learned in my past few months of reform, and now I’m happy to report that I’m forty, fit(ter), and fired up!

Podcasts

Here are a few interesting podcasts I listened to this week:
