What we can learn from AI’s mistakes

AI has been making a lot of progress lately by almost any standard. It has quietly become part of our world, powering markets, websites, factories, business processes and soon our houses, our cars and everything around us. But the biggest recent successes have also come with surprising failures. Tesla impressed the world by launching a self-driving car, but it then crashed in cases a human would have easily handled. AlphaGo beat the human champion Go player years before most experts thought possible, but completely collapsed after its opponent played an unusual move.

These failures might seem baffling if we follow our intuition and think of artificial intelligence the same way we think about human intelligence. AI competes with the world’s best and then fails in seemingly simple situations. But the state of the art in artificial intelligence is different from human intelligence, and it’s different in a way that really matters as we start deploying it in the real world. How? Machine learning doesn’t generalize as well as humans do.

Tiny autonomous car running tensorflow

The recent Tesla crashes and the AlphaGo loss highlight how this plays out in real life. Each of the Tesla crashes happened in a very unusual situation: a car stopped on the left side of a highway, a truck with a high clearance perpendicular to the highway, and a wooden stake in an unpainted highway. In the game AlphaGo lost, it fell apart when the Go champion Lee Sedol played a highly unusual move that no expert would have considered.

Why is it that AI can look so brilliant and so stupid at the same time? Well, for starters, it knows less about what’s going on than you think. Let’s look at a simple example to explain. AI can get spectacularly good at distinguishing between the use of the word “cabinet” to refer to a wooden cabinet and to refer to the president’s cabinet. Our intuition, based on our understanding of human intelligence, is that a machine would have to “understand” these two cabinet concepts to make this distinction so consistently. The human approach is to understand two different concepts by learning about politics and woodworking. Machine learning doesn’t need to do this. It can look at 1,000 sentences containing the word cabinet, each labeled (by a human) as corresponding to one or the other meaning. It learns how frequently words like “wood” or “storage” or “secretary” occur nearby in each case. So it knows that when the word “wood” is present, chances are extremely high that we’re referring to a storage cabinet. But if Obama starts talking about how he’s getting into woodworking, the AI may fail completely.
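To make the word-counting idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular production system works: the four training sentences, the labels “furniture” and “government”, and the function names train and classify are all made up for this example, and the scoring is a simple naive-Bayes-style count with add-one smoothing.

```python
import math
from collections import Counter

# Hypothetical, hand-labeled training sentences (a real system would use
# thousands, as described above).
TRAINING = [
    ("the oak cabinet has plenty of storage", "furniture"),
    ("he built a wood cabinet for the kitchen", "furniture"),
    ("the president met his cabinet secretary", "government"),
    ("the cabinet secretary briefed the president", "government"),
]


def train(examples):
    """Count how often each word appears alongside each labeled meaning."""
    counts = {}
    for sentence, label in examples:
        counts.setdefault(label, Counter()).update(sentence.split())
    return counts


def classify(counts, sentence):
    """Pick the meaning whose training sentences best match the nearby words."""
    vocab = {word for counter in counts.values() for word in counter}

    def score(label):
        counter = counts[label]
        total = sum(counter.values())
        # Add-one smoothing so an unseen word does not zero out a meaning.
        return sum(
            math.log((counter[word] + 1) / (total + len(vocab)))
            for word in sentence.split()
        )

    return max(counts, key=score)


model = train(TRAINING)
print(classify(model, "a wood cabinet for storage"))
print(classify(model, "the cabinet secretary"))
```

Nothing in this sketch knows what a cabinet is. It only knows which words tend to show up near each label, which is why a sentence that mixes the two worlds can confuse it.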

Artificial intelligence can work as well as it does without “knowing” the way humans “know” for a simple reason: machines can process far more training data than a human. Peter Norvig, Google’s head of research, famously highlighted this idea in a paper and talk called “The Unreasonable Effectiveness of Data”. This is how modern machine learning works in general: it pores over massive datasets and learns to generalize in smart ways, but not in the same smart way that humans generalize. As a result, it can be brilliant and also get very confused.

So how should we take all of this into account when we manage artificial intelligence in the real world?

1) Play to AI’s strengths: Collect more training data

Why does Facebook have such amazing facial recognition software? They have fantastic researchers, but the core reason is that they have billions of selfies. Why did Google build a better translation system than the CIA as a side project? They scraped more websites than anyone else, so they had more examples of translated documents.

AI improves more and more as it sees more and more data

Real breakthroughs in machine learning always come when there are new data sets. Deep learning isn’t much better than other algorithms on small amounts of data but it continues to improve on larger and larger data sets better than any other method.

2) Cover for AI’s weaknesses: Use human-in-the-loop

Artificial intelligence has a second advantage over human intelligence: it knows where it is having trouble. In the latest Tesla crash, the autopilot knew it was in an unusual situation and told the human repeatedly to take the wheel. Your bank does the same thing when it reads the numbers off a check. As of a few years ago, AI reads numbers off of almost all deposited checks, but checks with particularly bad handwriting still get handed off to a human for review. And more than fifteen years after Deep Blue beat Kasparov, there are still situations where humans can outplay computers at chess.

When done well, keeping a human in the loop can give the best of both worlds: the power and cost savings of automation, without the occasional unreliability of machine learning. A combined system has the power to be more reliable, since humans and computers make very different kinds of mistakes. The key to success is handing off between humans and computers in smart ways, which may very well require new types of interfaces that take advantage of their relative strengths and weaknesses. After all, what good is a near-perfect self-driving car AI that hands off control to a human it has let fall asleep?
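The handoff itself can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how Tesla or any bank does it: the Prediction class, the route function and the 0.9 threshold are all hypothetical, and it assumes the model reports a confidence between 0 and 1 with each answer.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str         # what the model thinks it read, e.g. a check amount
    confidence: float  # the model's own estimate, between 0 and 1 (assumed)


def route(prediction, threshold=0.9):
    """Accept confident predictions automatically; hand the rest to a person."""
    if prediction.confidence >= threshold:
        return ("auto", prediction.label)
    return ("human_review", prediction.label)


# A clearly written check versus one with bad handwriting (made-up values).
print(route(Prediction("1250.00", 0.99)))
print(route(Prediction("1250.00", 0.55)))
```

Choosing the threshold is the real design decision: set it too low and mistakes slip through, set it too high and the human ends up doing most of the work.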

The Best Organization Tool for a Disorganized Person

I love workflowy. I’ve used it every day for years. I think if everyone used it, the world would be a more productive, happier place.

If you haven’t tried it, it’s basically Gmail for your to do lists.

Remember when you had to put your emails in folders so you could find them? I really tried to keep organized folders because it was so painful when I had to search for an email. Some people seem to take great joy in organizing things, but I am not one of them. Foldering emails was my least favorite thing so I did it only in spastic fits of frustration. I would name the folders awful things like “MSFT – Misc” or “Legal BS” that made sense to me in the moment, but never made sense again.

Then Gmail came along with essentially unlimited storage and awesome search and I never had to worry about categorizing emails again. It was so powerful that if I wanted to remember something I would email it to myself and add a bunch of keyword tags to help me find it in the future. This made my life so much better.

When I started my company, I tried organizational tools and processes in the same haphazard way that I organized my email. I would keep track of performance reviews in Google Docs or Word docs in my Dropbox. I would try to track engineering todos in Jira. I tried a million tools to keep myself focused and none of them worked. I reverted to using a physical notebook.

But workflowy is like a notebook that’s always with you and more importantly, that you can search.  This is so powerful because even when you change your processes you can still find everything you wrote down.  In my 1:1s with employees sometimes we talk about urgent things and sometimes we talk about their career goals.  Sometimes we talk about their comp and sometimes we talk about their organizational concerns.  There’s no single good way to organize everything, because often I don’t know in advance what my employees are going to care about.  I write everything down in my workflowy in the haphazard, disorganized way that it comes at me.  I rarely refactor my notes and yet I can still find everything that someone has said to me.  With my longer tenured employees we can reflect on what their goals were in 2012 and how they’ve evolved.  I can pull up every conversation we’ve had about compensation when I do a comp review.  Most importantly, I don’t have to ask people the same questions multiple times.

I scribble notes in my workflowy about ideas I have for my blog and ideas I have for our conference and things I want to accomplish. I write down things I want to say to customers the next time I see them and things I want to say to my mom. Then I purge these thoughts from my mind until I see that person. It’s amazing how freeing this feels.

I love workflowy because it doesn’t tell me how to organize things or do things but it’s made me so much more organized and effective.

Metrics and Hiring Part 2

A few years ago, I realized that I purport to run a data-driven organization but I am the least data-driven when it comes to my company’s most important process: hiring. I wanted to change that, so I put all the hires I made throughout my career in a spreadsheet and looked for correlations. I graded everyone I hired on a five point scale and wrote down everything I knew at the time of hiring. The process actually changed the way I hire pretty significantly (you can see some of the results in my earlier blog post).

I’ve become much more intuitive about my interview process rather than working through a checklist of desired skills. I’ve also become much more aggressive about pulling people from my own network and I’ve encouraged my employees to do the same. I’m more open to people with unusual backgrounds, but I do a lot more thorough reference checking. I think hiring is a pretty personal thing and there are lots of different ways to do it successfully, and I wondered how much my results would match someone else’s experience. I was actually able to convince a friend, a very successful serial entrepreneur, to run the same experiment I did. He can remain anonymous, or if he’s willing I will post his name.

Anyway, here are some of his results:

His overall distribution looks somewhat like mine, but he’s labeled more of his hires as “Superstar” and “Disaster”. Maybe he has a riskier hiring strategy or maybe he’s just more opinionated than me 🙂.

He labeled schools on a 0-2 scale of general “prestigiousness” and referral strength on a 0-3 scale of how strong the referral connection was (I described the scale in my earlier post). I found a weak correlation between prestigiousness of school and employee outcome; he actually found no correlation or a weak negative correlation. Like me, he found a very strong correlation between referral strength and outcome.

He also looked at some things I hadn’t thought of. One thing he checked was how outcome changes over time. He found that he was getting a lot better at hiring. I like to think I’ve gotten a lot better at hiring, but now I want to go back and check.

He also found a correlation between dollar compensation and success, which is interesting. I think I would find a negative correlation: I believe that I’ve found executives harder to hire in general, although after recent adjustments I think I’ve gotten better at it.

He noticed a weak positive correlation between a competitive hiring process and success.

What’s the big takeaway here? The results are interesting, but they’re probably highly personal. Anyone who does a significant amount of hiring should really spend an hour and run the data on themselves. It’s probably the single best thing you could do with an hour of your time. If you want to share it with me, I’m happy to aggregate it, anonymize it and post it.
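If you want to run the numbers on yourself without a spreadsheet, a rough sketch in Python might look like the following. The rows in HIRES are invented purely for illustration (they are not my data or my friend’s), and the pearson helper and column layout are assumptions; the scales match the ones described above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up rows for illustration only:
# (referral strength 0-3, school prestige 0-2, outcome grade 1-5)
HIRES = [(3, 0, 5), (2, 2, 4), (0, 1, 2), (1, 2, 3), (3, 1, 4), (0, 2, 1)]

referral = [row[0] for row in HIRES]
school = [row[1] for row in HIRES]
outcome = [row[2] for row in HIRES]

print("referral vs outcome:", round(pearson(referral, outcome), 2))
print("school vs outcome:  ", round(pearson(school, outcome), 2))
```

With only a handful of hires a coefficient like this is noisy, so treat it as a prompt for reflection rather than a verdict.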

Why Do Certain Musical Notes Sound “Good” Together

This was originally a response to a question on Quora.

Two notes sounding “good” together sounds like a very subjective statement.  The songs we like and the sounds we like are incredibly dependent on our culture, personality, mood, etc.

But there is something that feels fundamentally different about certain pairs of notes that sound “good” together.  All over the world humans have independently chosen to put the same intervals between notes in their music.  The feeling of harmony we get when we hear the notes C and G together and the feeling of disharmony we get when we hear C and G flat together turns out to be part of the universal human experience.

Instead of using subjective notions of “good” and “bad”, scientists call the feeling of harmony “consonance” and the feeling of disharmony “dissonance”. Some cultures and genres of music use a lot more dissonance, but most humans perceive the same relative amounts of dissonance between pairs of notes.

The most consonant pairs of sounds are two sounds that are perceived as having the same “pitch”. In other words, the G key below middle C on my piano is so consonant with the G string on my guitar that they are said to be the same note.

Here is a recording of one second of me playing the G-string on my guitar. This graph shows the waveform of the sound, which is really just a rapid series of changes in the air pressure. Hidden within this waveform are patterns that our ears and brain perceive.

These waves cause little hairs in our ears, called stereocilia, to vibrate, with different hairs vibrating at different frequencies. You can think of sound as the sum of different frequencies of vibrations, and the hairs in our ears extract the amount of each frequency contained in the sound. We can also use math to extract the frequencies contained in the sound, as I did below with something called a Fourier Transform.

We commonly think of a pitch, like a G, as having a single “frequency”, but the graph shows that the sound is actually composed of various amounts of many different frequencies. In this case the lowest frequency of the string is 196Hz, or 196 vibrations per second, but the string is also vibrating at double, triple and four times that. The lowest frequency is called the fundamental frequency. The higher frequencies are called overtones, also known as harmonics when they are at simple multiples of the fundamental frequency. Instruments with vibrating strings like my guitar tend to vibrate at multiple frequencies where each frequency is a multiple of the lowest frequency. This is related to the physics of a string and it will be really important here.
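To make the idea of extracting frequencies concrete, here is a minimal Python sketch. It is not the code behind the graphs in this post: instead of a real recording it synthesizes a stand-in for the string, and the overtone amplitudes, the 8000 samples-per-second rate and the function names are all assumptions made for the example. It measures one Fourier component at a time, which is the same math as a Fourier Transform restricted to a few frequencies.

```python
import cmath
import math

SAMPLE_RATE = 8000   # samples per second (an assumption for this sketch)
FUNDAMENTAL = 196.0  # Hz, the G string described above


def synth_string(fundamental, amplitudes, sample_rate=SAMPLE_RATE, seconds=1.0):
    """Fake a plucked string: a sum of sine waves at multiples of the fundamental."""
    n = int(sample_rate * seconds)
    return [
        sum(
            amp * math.sin(2 * math.pi * fundamental * (k + 1) * i / sample_rate)
            for k, amp in enumerate(amplitudes)
        )
        for i in range(n)
    ]


def amplitude_at(samples, freq, sample_rate=SAMPLE_RATE):
    """How much of one frequency the sound contains (a single Fourier component)."""
    n = len(samples)
    total = sum(
        s * cmath.exp(-2j * math.pi * freq * i / sample_rate)
        for i, s in enumerate(samples)
    )
    return 2 * abs(total) / n


# Made-up overtone strengths: each harmonic half as strong as the one before.
samples = synth_string(FUNDAMENTAL, [1.0, 0.5, 0.25, 0.125])
for freq in (196, 300, 392, 588, 784):
    print(freq, "Hz:", round(amplitude_at(samples, freq), 3))
```

The printout shows energy at 196Hz and its multiples and essentially none at 300Hz, which is the evenly spaced pattern of peaks the frequency graphs show for a real string.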

Here is a one-second recording of me singing along to the G-string.

This audio waveform looks pretty different from the recording of my guitar, but when we look at the frequencies we can see that the two match up.

I added red dots to this frequency graph to highlight where the harmonic frequencies are and show the uniform spacing.  Each dot is exactly 196Hz apart just like in the graph of the guitar’s frequencies.

The lowest or fundamental frequency of the recording of my voice matches the 196Hz of my guitar string shown on the previous graph.  It’s amazing that we are able to make our voices harmonize so exactly without even thinking about it.

When I sing the G note along with my guitar my voice and my instrument are causing the same hairs in my ear to vibrate.

The fact that the frequency peaks or red dots are evenly spaced is a physical property of our vocal cords and comes from the fact that our vocal cords are essentially a long tube of air. Other instruments that are like long tubes of air have the same property, such as flutes, saxophones, horns and harmonicas.

When I play my guitar an octave higher I can make a harmony.  A one second recording looks like this – again totally different from the previous two.

But when I look at the frequencies in its composition, they are exactly double the frequencies of the low G string or me singing the low G. The red dots show the spikes from our earlier low G graph, and the yellow dots are the frequency spikes from the high G sound.

So when you go an octave up, the same hairs will vibrate as with the lower octave, although not all of them. That’s what gives us the sense of two “notes” being the same even when they’re an octave apart.

Almost every culture that has a notion of an octave also has a notion of a “fifth”, or a note halfway through an octave. Two notes that are a fifth apart are the most consonant of any two notes that are not the same.

The G note is the “fifth” of a C note. In western music, all of the most common chords with a C root have a G note in them. Why do a C and a G fit so well together? Here are the frequencies of playing a C on my guitar.

You can see in red the harmonics (or frequency spikes) of my G note and in yellow the harmonics of my C note. They don’t always line up, but because my G note’s fundamental frequency is 3/2 of my C note’s, they line up at every 3rd harmonic of the C and every 2nd harmonic of the G.

The two notes that sound most consonant with a C are F and G, corresponding to the “perfect fourth” and “perfect fifth” intervals from C. Why do they line up so well? We can look at how many of the harmonics line up.

You can see that G and F harmonics line up quite frequently with C’s harmonics at the bottom. But notice that G and F’s harmonics don’t line up with each other very frequently. So G and C sound very consonant and F and C sound very consonant but G and F sound much more dissonant. This is why it’s very common to play G and C together or F and C together but it’s less common to play a C, G and F all at once.
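The counting can be reproduced with a short Python sketch. It assumes idealized simple-fraction tunings (G at 3/2 of C and F at 4/3 of C, which makes G 9/8 of F) rather than the exact frequencies of my guitar, and it only looks at the first 12 harmonics of each note; the function name shared_harmonics is made up for this example.

```python
from fractions import Fraction


def shared_harmonics(ratio, count=12):
    """Pairs (i, j) where harmonic i of the lower note equals harmonic j of the upper.

    ratio is the upper note's fundamental divided by the lower note's.
    """
    return [
        (i, j)
        for i in range(1, count + 1)
        for j in range(1, count + 1)
        if i == j * ratio
    ]


# Idealized intervals (assumed simple-fraction tunings, not measured values).
INTERVALS = {
    "C and C an octave up": Fraction(2, 1),
    "C and G (fifth)": Fraction(3, 2),
    "C and F (fourth)": Fraction(4, 3),
    "F and G": Fraction(9, 8),
}

for name, ratio in INTERVALS.items():
    matches = shared_harmonics(ratio)
    print(name, "->", len(matches), "shared harmonics", matches)
```

In this idealized version the octave shares the most harmonics, the fifth and fourth share several, and F against G shares only one within the first 12, which matches the ordering from consonant to dissonant described above.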

All of the notes that are consonant with C have intervals with many harmonics overlapping as you can see on this bigger chart.

You can see here that C and E have lots of overlapping harmonics – C, E and G would be a C major chord.  C and D# have almost as many overlapping harmonics and C, D# and G would be a C minor chord.

Some notes don’t correspond to any simple fractional interval, and those notes sound very dissonant. For example, playing C and F# together is extremely dissonant because there are no overlapping harmonics (the F# doesn’t even quite line up with a 2/5 interval; for more on this see my answer to Why are there 12 notes?).

Some instruments don’t produce these overtones at simple multiples of the fundamental frequency.  Drums usually don’t produce simple overtones because the vibrations travel across them in more than one dimension, which creates more complicated patterns.  This is why you can’t typically hear drums harmonizing with each other even though they have a recognizable pitch.

We can stop there if we want to, but there are other psycho-acoustic effects that affect consonance vs. dissonance. One effect worth mentioning is the dissonance we hear when two frequencies are close but not overlapping.

When two notes are played close together the waveforms look roughly like this:

When we extend out the waveforms we can see that they move in and out of phase.

Our ear hears the sum of the blue and the orange waveform which looks like this.

Or looking at a longer time period:

When the waveforms are in sync at the beginning they amplify each other, but as they get out of phase they subtract from each other. This creates a beating sound that is very recognizable if you’ve ever heard an out of tune piano or an out of tune guitar.
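A small Python sketch shows where the beating comes from. The two frequencies (440Hz and 444Hz) are arbitrary values chosen for illustration, and the function names are made up. It relies on the standard trigonometric identity that the sum of two equal-strength sine waves is a fast wave at the average frequency multiplied by a slow envelope, 2·cos(π·(f1 − f2)·t), whose loudness rises and falls |f1 − f2| times per second.

```python
import math


def summed_wave(f1, f2, t):
    """What the ear receives: two pure tones added together at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)


def envelope(f1, f2, t):
    """Slowly varying loudness of the sum: 2 * |cos(pi * (f1 - f2) * t)|."""
    return 2 * abs(math.cos(math.pi * (f1 - f2) * t))


def beat_frequency(f1, f2):
    """Number of loud-quiet cycles per second."""
    return abs(f1 - f2)


F1, F2 = 440.0, 444.0  # two slightly mistuned notes (assumed values)
print("beats per second:", beat_frequency(F1, F2))
for t in (0.0, 0.0625, 0.125, 0.1875, 0.25):
    print(f"t={t:.4f}s  envelope={envelope(F1, F2, t):.3f}")
```

The envelope starts at its maximum when the waves are in sync, drops to zero when they are fully out of phase, and then swells again, which is the pulsing you hear.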

To western ears this sounds like an out of tune instrument.  Some cultures incorporate this sound into their music.  It’s pretty clear that this is an effect associated with dissonance.  As other people have mentioned in their answers, two pure sounds with frequencies that are within a note or two are universally heard as dissonant.