## Question

How much storage space do you need to store all the information from Google maps?

## Answer Structure

In estimation questions, interviewers are evaluating your problem-solving and quantitative skills. They are looking for a ballpark number, not an accurate number. What matters is whether you are logical, can explain all your assumptions clearly, are organized in your work, and good with numbers.

The book *Cracking the PM Interview* by Gayle Laakmann McDowell and Jackie Bavaro provides a good suggestion on how to structure an answer for this type of question. It can be summarized as:

**Ask clarifying questions**. This will eliminate any ambiguity of what should be included in your calculations.**Make an equation**. Consider edge cases or alternate sources of data and write down any facts that you know will help with calculations.**Break down the equation into components**. Write down any assumptions next to the components so you don’t forget to explain them clearly to the interviewer.**Do the math**. Calculate the result of each component and compute the result.**Do a sanity check**. Do your results make sense? If not, recheck your equation, assumptions and arithmetic.

## Answer Example

**INTERVIEWEE**: Is this estimation for the US alone or the world?

**INTERVIEWER**: For the world.

**INTERVIEWEE**: Okay. Based on my experience with Google maps, it provides information about:

- Street names
- Homes
- Public buildings
- Businesses
- Traffic
- Non-urban areas, like mountainous terrains

**INTERVIEWER**: Okay, good, but why are you differentiating between homes, public buildings and businesses?

**INTERVIEWEE**: I have seen that Google maps displays more photos for buildings and business locations than for homes. Therefore, public buildings and businesses will require more storage than homes and I need to treat them separately.

**INTERVIEWER**: ok, agreed.

**INTERVIEWEE**: Here is how I would estimate storage for each of these categories.

Street names storage:

[# cities] X [# streets/city] X [storage for street name]

Homes storage:

[# cities] X [# homes/city] X [one photo storage + text storage]

Public buildings storage:

[# cities] X [# public buildings/city] X [10 X one photo storage + text storage]

Businesses storage:

[# cities] X [# businesses/city] X [10 X one photo storage + text storage]

Traffic storage:

[# streets] X [storage/minute] X [# minutes/year]

Non-urban areas:

[surface of the earth in square miles] X [25%] X [storage/100 sq miles]

Total storage

= [Street names storage]

- [Homes storage]
- [Public buildings storage]
- [Businesses storage]
- [Traffic storage]
- [Non-urban areas]

To estimate the storage required for street names, homes, public buildings and businesses, I will differentiate between large and small cities, because big cities have more of these types of locations than small cities.

For traffic storage estimation, I will assume that traffic information is kept only for a year.

Okay, let me start with the big city calculations:

#### Big city streets calculation

I will use New York city as a proxy for a big city. It has about 1000 streets, and assuming it takes 100 KB to store the street name and other metadata, like its distance, then it takes 100 MB for a big city to store street information. I am not including photos of the street, because I will account for the street photos in the homes, public buildings and businesses storage calculations.

#### Big city homes calculation

For the big city homes calculation, New York city has about 9M people, and assuming there are 3 people per home, that gives 1M homes. Then assuming 4 MB for a photo of the street view and 100 KB for the home location information, and multiplying by 1M homes results in about 12 TB.

#### Big city public buildings calculation

Assuming there are about 10,000 public buildings in a big city, multiplying that by (40 MB + 100 KB) which is storage for photos and text, results in 400 GB. I am assuming that there are about 4x as many photos for buildings than for homes.

#### Big city businesses calculation

Assuming there are about 20,000 businesses in a big city, multiplying that by (40 MB + 100 KB) for photo and text storage results in 800 GB.

Let’s move to small city calculations:

#### Small city streets calculation

I will assume that the number of a streets in a small city is about 1/4 of the number of streets in a big city, so 1,000 / 4 = 250. Multiplying this number of streets by 100 KB to store the street metadata results in 25 MB.

#### Small city homes calculation

I will assume that an average of 300,000 people live in a small city. Divide by 3 persons per home, gives 100,000 homes. And multiplying that by (4 MB + 100 KB) for photo and location storage, results in 410 GB.

#### Small city public buildings calculation

I will assume the number of public buildings is 1/10 of the number of buildings in a big city, so 10,000 / 10 = 1,000 buildings. Multiplying that by (40 MB + 100 KB) of storage for photos and text, results in 40 GB.

#### Small city businesses calculation

I will assume the number of businesses is 1/4 of the number in a big city, so 20,000 / 4 = 5,000. Multiplying that by (40 MB + 100 KB) of storage for photos and text results in 200 GB.

With these results, I can now calculate the total storage for one big city as 100 MB + 12 TB + 400 GB + 800 GB = 13 TB. And, the total storage for a small city is 25 MB + 410 GB + 40 GB + 200 GB = 650 GB. Now we need to multiply these numbers by the number of big cities and small cities. I know there are about 1,000 big cities in the world and I am going to guess that there are about 4,000 small cities, so the total storage for city information in Google maps is: 1,000 X 13 TB + 4,000 X 650 GB, which is about 15 PB.

Let’s calculate traffic data stored in Google maps. I will assume that traffic data is only kept for one day. Under this assumption, I will estimate storage by multiplying the number of streets in the world times storage needed to record traffic per minute times minutes in a day. To calculate the number of streets, I will first estimate the number of streets in big cities and then small cities. I know there are about 1000 big cities in the world and I will assume that a big city, like New York City, has 1000 streets, so 2,000 x 1,000 results in 2M streets for big cities. Assuming there are 4,000 small cities in the world, and that each small city has about 250 streets, then there are 4,000 X 250 = 1M streets for all small cities. Then multiplying the number of streets by the storage per minute and minutes in a day gives (1M + 1M ) X 10 MB / minute X 1,440 minutes / day, resulting in 28 PB. I am assuming here that it takes 10 MB per minute to record traffic photos per street.

Now let’s calculate non-urban areas. I will use the surface area of the earth to do this. The surface area of the earth is about 4 x π x radius², where the radius of the earth is about 4000 miles. Assuming that 25% of the earth surface is solid, and to store metadata takes 400 MB per 100 sq. miles, the equation becomes 4 x 4 x (4,000 miles)^2 X ¼ X 400 MB / 100 sq miles = 256 TB.

Finally, adding storage for cities, traffic, and non-urban areas results in 15 PB + 28 PB + 256 TB = 43 PB. Google holds about 15 EB per year, so I think that 43 PB for Google Maps is not unreasonable.