Why I Am Changing My Mind About AI Risk

The following is pretty unpolished, and except for some minor edits was written a long time before I published it. It might not be representative of my current thoughts in multiple ways. Nonetheless I think publishing it will be more honest and informative than not publishing it.

I have long been an observer of Less Wrong and the rationality movement. When I first read Eliezer Yudkowsky’s Sequences (around 2010) I was very entertained by them: they were well-written texts which, as I thought, got a surprising number of difficult questions right. Of course, there were many points where I disagreed with him, and still do. When I read his futurist articles, they made sense, but I was skeptical. Their point of view was weird, but there was a sense in which I found that appealing: I had already thought it likely that a truly rational look at our world would be equally weird, so I saw it as fitting in some sense with the rest of the Sequences’ rationalist messages. The arguments, more popular during the early history of Less Wrong, that donating to SIAI (as MIRI was then called) had an enormous expected value even with a small chance of success did not convince me, but they would occasionally frighten me when I thought about them.

I decided to hold off before making decisions, think about things some more, and definitely not give them money while there was a large chance they were loonies. So I gave myself time to think about the arguments for and against. Eventually (I believe around 2011 or 2012) I decided that AI risk and its proponents really were ridiculous.

I am now in the process of changing my mind about this. Here are three reasons why:

  1. I am amazed by the progress of AI in the past few years using convolutional and recurrent neural networks. In particular, the observation that underlying the large variety of recent achievements is a relatively small set of relatively simple ideas suggests to me that there really is an underlying method behind “general-purpose” and that we have found a piece of it. Whereas beforehand I was unconvinced that any line of research could be known to have relevance to future general AI, I see this as a possible counterexample.

  2. My opinion on the issue is influenced by my impression of what other people think about it. I try to be open and unrepentant about this: I believe that learning from the collective opinions of others is rational. AI risk did not look good. It was mostly only discussed seriously in a single community over a small number of websites. While the advocates claimed they were running a technical research program, they were to a large extent unconnected to academia, and were fundraising from the public rather than from institutions qualified to judge them on a technical basis, which is suspicious.

    This is changing. Although AI risk is still not mainstream, it has gotten much bigger than it used to be. I now believe that even if I hadn’t initially gotten into Less Wrong when I did, I still would have been exposed to these ideas. And now that there are more players in the game I can better mentally separate the questions of whether MIRI is any good and whether AI risk as a whole is important.

    In retrospect it seems like I did not react to this factor as quickly as I could have.

  3. Over the past two years, I have undergone a period of psychological hardship, and I worry it has affected my cognition. In particular, it would have increased my positive affect toward the rationalist community. Optimistically, this let me look at the issues in a new light, unbiased by my preconceptions. Pessimistically, in a moment of weakness I have let new opinions enter without examining them with due diligence, and a subtle flaw lies hidden. I imagine another person in my situation might become religious.

Of these reasons, the most worrying is the first one — it says we might not have much time. The third one is also worrying to me on a personal level.

In the short term, here are some things that I think of doing:

  1. I still disagree in significant ways with many of the positions current advocates have on the details of AI risk. I hope I’ll be able to write up my thoughts on this matter, and that people will give it enough attention either to be convinced or to take the time to convince me otherwise. Currently the writer I’ve seen whose opinion most resembles mine is Paul Christiano.

  2. Seeing as I’m still not sure, and may never be, about AI risk, I intend to be on the lookout for any reason to change my mind again. Unfortunately, if I start thinking of myself as seriously committed to AI risk, changing my mind will be more difficult.

Edit (2017-01-04): Changed title from “Why I Changed My Mind About AI Risk”. This is the title I was intending, and I don’t know why I ended up writing “changed” instead of “am changing”.


Suppose an alien came down to earth, and described an operation it can perform on you. This operation will make you a lot stronger and tougher physically. It will also make you a lot smarter, more conscientious, and can even make you a more moral person. Inexplicably, if you get this operation people will take you more seriously and will be more likely to believe anything you say. And all of these effects are huge.

There’s one catch: Anyone who undergoes this operation becomes completely obsessed with cleaning pottery jugs. Seriously, it doesn’t matter how boring you consider this now, it’ll be what you spend all of your time on. If you have anything else you care about you’ll still remember that, but it’ll take second place to cleaning pottery jugs.

Now, you’re expecting that this hypothetical will continue with you being asked what you think about this operation, and whether you want the alien to perform it on you or not. Yeah right! As if the alien gives a fuck what you think! In fact it has already done this operation. Luckily it takes a while for it to take effect, so you still have some time to be your old self for a few years. Good luck!

A Theory Concerning the Foundation of the United States

[Epistemic status: Crackpot]

This post is about some observations I’ve been developing for quite a while. I was inspired to post about it by a recent chapter of the new web serial Unsong, which touches upon similar themes. Some details have only been worked out now that I’m writing this up.

The founding legend of Rome, as recounted by Virgil’s Aeneid, was that the Romans were descendants of Trojans. Troy was itself a great empire; when it was sacked by the Greeks, the Trojan Aeneas was ordered by the gods to found a new Troy in the province of Italy. So he wandered the sea for many years until he finally reached Rome. Now, there’s a gap of a few hundred years between the accepted dates of the fall of Troy and the founding of Rome; Aeneas did not found Rome right away. Rather, he founded the city of Alba Longa, which his descendants ruled until Romulus and Remus, who founded Rome.

It is certainly not a new idea to compare the United States with Rome. Both are powerful empires that center their self-identity on their democratic institutions. However, I believe the connection between the two is much deeper than that…

People commonly place the fall of the Roman Empire during the fifth century CE. This is not quite right. See, the Western Roman Empire fell in 476, but the Eastern Empire, now known as the Byzantine Empire, persisted a long time after that. In fact, the Eastern Roman Empire only fell in the year 1453.

In 1492, merely 39 years later, Christopher Columbus discovered the continent of America. Most historians believe that he was from the Republic of Genoa, which like Rome was a republic in Italy. However, an even stronger connection can potentially be made. Some people believe that Christopher Columbus was of Byzantine origin, and may even have been related to Byzantine nobility. This is especially significant if it is possible to trace a lineage from Aeneas himself.

Before Christopher Columbus, Leif Erikson independently discovered and explored North America, and the Norse eventually named the region Vinland, due to the grapevines that grew there. The name Oenotria appears in some ancient sources, including three times in the Aeneid, to refer to southern Italy. The name comes from Greek οἶνος “wine”, since the area was rich in vineyards.

The parallel between Christopher Columbus’s journey and the journeys of Aeneas and Odysseus is obvious. Notice, too, that like how Aeneas does not immediately found Rome, but rather founds Alba Longa which centuries later produces Rome, so too Christopher Columbus is only responsible for exploring the continent wherein centuries later George Washington founds the republican empire.

From the strength of these analogies I can only conclude one thing: that the same events have occurred twice, and Columbus was divinely inspired to explore America and found a new Rome, just as Aeneas was commanded to found a new Troy.

Although originally I only thought to make Graeco-Roman connections, inspired by Unsong it’s worth looking a bit into the Judaeo-Christian relationships. The obvious analogy is Moses, who wandered the desert forty years seeking to found the new nation of Israel. One interesting contrast is that although both Odysseus and Aeneas reach their intended destination at the end, Christopher Columbus sought to reach India but never arrived there, like how Moses never set foot in the land of Israel. A different interpretation of the fact that Christopher Columbus failed to reach India is that although India was where he desired to go, the divinely-fated target for his journey was America, like how Aeneas wanted to stay in Carthage with Dido but was fated to found a nation in Italy.

An Idea for Improving Hashlife

(This was written with the priority of making sure my thoughts don’t just stay in my head forever over explaining anything well. Expect some parts to be cryptic or badly phrased.)

Hashlife is currently the best algorithm for simulating large structured patterns in Conway’s game of life and other cellular automata for long periods of time. It is described here. Basically, it is a memoized recursive algorithm for computing the evolution of a 2^n \times 2^n block.

Memoization means that whenever the algorithm encounters a 2^n \times 2^n block that it has seen before, it can instantly reuse the answer it previously computed. This is what gives the algorithm its power. On the other hand, the algorithm can only detect this if the two configurations are aligned exactly the same on the 2^n \times 2^n blocks which it divides the grid into. In other words, it doesn’t take full advantage of translational symmetry, but only takes advantage of it when it’s a translation by a multiple of the block size[0]. Due to the way Hashlife calculates many time-steps of a pattern at once there is a similar alignment problem in time.
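To make the alignment problem concrete, here is a small sketch. It is not Hashlife itself, only an illustration under my own assumptions: the grid is a list of rows of 0/1 values, and the helper names distinct_aligned_blocks and dotted_grid are made up for this example. It counts how many different tiles a cache keyed on aligned 8×8 blocks would have to store for a grid of regularly spaced dots.

```python
def distinct_aligned_blocks(grid, block=8):
    """Count the distinct block-by-block tiles on the aligned partition of a square grid.

    This is the number of entries a cache keyed on aligned blocks would hold.
    """
    size = len(grid)
    seen = set()
    for y in range(0, size - block + 1, block):
        for x in range(0, size - block + 1, block):
            tile = tuple(tuple(grid[y + dy][x:x + block]) for dy in range(block))
            seen.add(tile)
    return len(seen)


def dotted_grid(size, period):
    """Square grid with a live cell wherever both coordinates are multiples of period."""
    return [
        [1 if x % period == 0 and y % period == 0 else 0 for x in range(size)]
        for y in range(size)
    ]


if __name__ == "__main__":
    # Dots spaced by the block size always land on the same spot in every block.
    print(distinct_aligned_blocks(dotted_grid(72, 8)))
    # Dots spaced by 9 drift relative to the 8x8 partition.
    print(distinct_aligned_blocks(dotted_grid(72, 9)))
```

When the spacing equals the block size every aligned block looks the same, so one cached answer serves them all. With a spacing of 9 the same dot shows up at a different offset in almost every block, and the cache sees each of those as a new block.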

For example, the Caterpillar is a humongous spaceship that moves forward 17 tiles every 45 steps. It has a lot of repeating components, but they all move at this speed, so they are rarely in the same alignment. Here Hashlife runs really slowly.

So I’ve been thinking about how to make a better version of Hashlife which doesn’t have these constraints. Then the problem is to recognize a pattern if it was seen previously with a different alignment. The first idea I eventually came up with is to use what I call a translation-invariant hash. If you take this hash on two blocks of tiles that almost completely overlap, this function should return the same or a similar value. Clearly this is not a good hash function from the conventional point of view, but it is very useful here: If you make a hash table based on a translation-invariant hash, then a lookup for a block B could also return a block B’ which contains a translation of B. This means you can find that the same pattern was calculated already even if it is out of alignment.

Here is a simple example of a translation-invariant hash: Let H be an ordinary hash function on 8×8 blocks. For some large block B, one can define H_T (B) to be the sum of H (X) for every 8×8 X that is contained in B. Then a translated block will only differ in terms of the hashes on the boundary, which on a large block will be a minority. By truncating the last digits of this you get a hash that’s completely identical for most small translations.
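A minimal sketch of H_T as just described, under some assumptions of mine: the grid is a list of rows of 0/1 values, the ordinary hash H is a truncated BLAKE2b digest (any ordinary hash would do), and the names window_hash and sum_hash are made up for this example.

```python
import hashlib

K = 8  # side length of the small windows, as in the text


def window_hash(grid, x, y, k=K):
    """Ordinary hash H of the k-by-k window whose top-left corner is (x, y)."""
    rows = b"".join(bytes(grid[y + dy][x:x + k]) for dy in range(k))
    return int.from_bytes(hashlib.blake2b(rows, digest_size=4).digest(), "big")


def sum_hash(grid, x0, y0, size, k=K):
    """H_T of the size-by-size block at (x0, y0): the sum of H over every window inside it."""
    return sum(
        window_hash(grid, x, y, k)
        for y in range(y0, y0 + size - k + 1)
        for x in range(x0, x0 + size - k + 1)
    )
```

Shifting a block by one tile keeps all but one column (or row) of windows, so the two values of H_T differ only by the hashes of the windows gained and lost at the boundary. How many low-order digits to truncate depends on the block size and on the range of H, and is left open here.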

Now, one problem that can come up is: now that we have found two blocks that are approximately translates, how do we tell by how much one is a translate of the other? In this case there is an easy method. Alongside the function H_T, one can also calculate two other functions H_{T X}, H_{T Y}, such that H_{T X} (B) (respectively, H_{T Y} (B)) is the sum of x H (C) (resp. y H (C)) where C is an 8×8 block contained in B whose northwest corner has coordinates (x, y). (Here 8 is an arbitrarily chosen number, in this case because it’s a small power of 2.) Then if B and B' satisfy H_T (B) \sim H_T (B') and they really are close translates, the position of B' relative to B would be approximately

(\frac {H_{T X} (B') - H_{T X} (B)} {H_T (B)}, \frac {H_{T Y} (B') - H_{T Y} (B)} {H_T (B)})

Then the data structure for a block B will store, along with H_T (B), these “integral hashes” H_{T X} (B), H_{T Y} (B).
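Here is a sketch of the integral hashes and the offset estimate. I am assuming that (x, y) are absolute grid coordinates with y increasing downward, so that the top-left corner of a window is its northwest corner, and that H is a truncated BLAKE2b digest. The names block_moments and estimate_offset are hypothetical.

```python
import hashlib

K = 8  # side length of the small windows, as in the text


def window_hash(grid, x, y, k=K):
    """Ordinary hash H of the k-by-k window whose northwest corner is (x, y)."""
    rows = b"".join(bytes(grid[y + dy][x:x + k]) for dy in range(k))
    return int.from_bytes(hashlib.blake2b(rows, digest_size=4).digest(), "big")


def block_moments(grid, x0, y0, size, k=K):
    """Return (H_T, H_TX, H_TY) for the size-by-size block at absolute position (x0, y0)."""
    h_t = h_tx = h_ty = 0
    for y in range(y0, y0 + size - k + 1):
        for x in range(x0, x0 + size - k + 1):
            h = window_hash(grid, x, y, k)
            h_t += h        # plain sum of window hashes
            h_tx += x * h   # weighted by the window's x coordinate
            h_ty += y * h   # weighted by the window's y coordinate
    return h_t, h_tx, h_ty


def estimate_offset(moments_b, moments_b2):
    """Estimated position of B' relative to B, using the formula from the text."""
    h_t, h_tx, h_ty = moments_b
    _, h_tx2, h_ty2 = moments_b2
    return (h_tx2 - h_tx) / h_t, (h_ty2 - h_ty) / h_t
```

When B' is an exact copy of B moved by (dx, dy), every window hash is the same and every coordinate is shifted by the same amount, so H_{T X} (B') = H_{T X} (B) + dx H_T (B) and the estimate is exact. For blocks that only mostly overlap it is an approximation.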

I will not discuss how to take advantage of the overlapping blocks found this way to speed up the computation of the cellular automaton.

This in itself may already be an improvement (I haven’t written any code so I can’t benchmark this), but H_T has some weaknesses. The problem is that it is way too loose. It produces a collision for two overlapping blocks, but it also produces a hash collision in loads of other situations. For instance, it produces an almost identical value for the empty block and for a block that is empty except for a small object. These are closer to each other than most of the combinations of overlapping blocks, which are the things that are supposed to collide. Worse, if there are two small objects on an otherwise empty block which are far away from each other, then H_T returns an exactly identical hash however they are positioned relative to each other, as long as they stay far apart. If you want any algorithm based on this hash function to work, it is necessary to check a block found by the hash table to verify it actually overlaps. This adds to the computation time.

The problem is that the hash function is too local: it only depends on the properties of a random 8×8 region in a block. Perhaps a better idea would be to use larger subregions, for instance, sum the hashes of \sqrt{N} \times \sqrt{N} subregions when the block is N \times N. However, this would take too long to compute (asymptotically O (N^3), around the same time it would take to calculate the evolution of the pattern directly for O (N) steps). Instead, it would be better to look at the hashes of only some of the subregions, which are determined in a translation-invariant way. Here is my second idea: define an i-focal point as follows:

  • Every point is 0-focal.
  • The i-hash of an i-focal point is the (ordinary) hash of the 2^{i+3} \times 2^{i+3} rectangle which has that point as its southwest corner. This rectangle is called the region associated with that point.
  • An i+1-focal point is an i-focal point whose i-hash is greater than the i-hash of all i-focal points up to 2^{i+3} tiles south and up to 2^{i+3} tiles east of it.


Then considering only the i-hashes of i-focal points is translation-invariant and feasible to compute.
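The definition above can be sketched as follows. This is only one reading of it, with several assumptions of mine: the grid is unbounded and given as a set of live (x, y) cells with y increasing northward, “up to 2^{i+3} tiles south and east” means the square of points reaching that far in both directions, ties count as failing the “greater than” test, and the ordinary hash is a truncated BLAKE2b digest. The name make_focal_tester is made up.

```python
import hashlib
from functools import lru_cache


def make_focal_tester(live_cells):
    """Return (i_hash, is_focal) for an unbounded grid with the given live cells."""
    live = frozenset(live_cells)

    @lru_cache(maxsize=None)
    def i_hash(x, y, i):
        """Ordinary hash of the 2^(i+3)-sided square whose southwest corner is (x, y)."""
        side = 2 ** (i + 3)
        bits = bytes(
            (x + dx, y + dy) in live for dy in range(side) for dx in range(side)
        )
        return int.from_bytes(hashlib.blake2b(bits, digest_size=8).digest(), "big")

    @lru_cache(maxsize=None)
    def is_focal(x, y, i):
        """True if (x, y) is i-focal."""
        if i == 0:
            return True  # every point is 0-focal
        if not is_focal(x, y, i - 1):
            return False
        reach = 2 ** (i + 2)  # 2^((i-1)+3): how far south and east to compare
        mine = i_hash(x, y, i - 1)
        for dy in range(reach + 1):
            for dx in range(reach + 1):
                if dx == 0 and dy == 0:
                    continue
                qx, qy = x + dx, y - dy  # east is +x, south is -y
                if is_focal(qx, qy, i - 1) and i_hash(qx, qy, i - 1) >= mine:
                    return False
        return True

    return i_hash, is_focal
```

Because both the hashes and the comparisons depend only on cell contents relative to the point, translating the whole pattern translates its focal points along with it. On a finite block this only holds for points whose regions and comparison neighbourhoods fit inside the block, which this sketch sidesteps by treating the grid as unbounded.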

However, once we have these i-focal points there’s something even better we can do. Remember that the goal of the whole translation-invariant hash was so that we’d be able to recognize a pattern we’ve already encountered even when it’s translated. However, these i-focal points and their corresponding hashes do the job even better: The same region will have the same i-focal points no matter how it is translated, and no coarse-graining is necessary from any averaging process. So it is a good idea to make a hash table for caching all the regions associated with i-focal points to recognize translates, and not use the original averaging idea with translation-invariant hashes at all. However, since I only came up with that simplification while I was writing this up, I decided to still include the original idea. I know that makes this description pretty messy.

All this doesn’t mention time. The original Hashlife also has a feature where it evaluates blocks for 2^n generations at once. This causes temporal alignment problems similar to the spatial alignment problems I’ve already discussed. I expect pretty much the same solutions to work here. These are really just general ideas for recognizing patterns in n-dimensional space and should still work when time is added as a coordinate.

[0] Actually, the translation only needs to be a multiple of half the block size, due to how Hashlife calculates the areas between the blocks.

This blog now has a title

Note: Woops, made this a “page” rather than an ordinary post. Changed it.

I remember the early days of this blog. A lot of things have changed since then. Back then, it was just me writing things and nobody was reading it. Right now there probably still isn’t anybody reading this (not even you). I guess some things never change.

Indeed, this blog started from nothing and almost quadrupled in size since. And in this great growth story, today is an important landmark. That is because today, this blog has a title. It also now has a tagline.