Hey there PaleoPals. Today’s article is a bit unorthodox, as it’s actually a quick summary of an experiment that I did with a friend for one of my graduate classes recently. If you’re interested in engineering, statistics, testing, and scientific journals, you should enjoy it. If none of those things interest you, I promise there will be a prize at the end if you read all the way through.
In engineering, there exists a field of study known as Systems Engineering, which deals with managing extremely complicated systems (like a rocket or jet fighter) that span many different disciplines, and producing a product with the ease and efficiency of building a Lego set.
Inside of Systems Engineering, there is a study known as Design of Experiments (DoE), which deals with how to effectively “design an experiment” (engineers aren’t very creative, lexicologically. Yes that’s a word. I just made it up.) DoE is an extremely important tool for complex projects where many thousands of things may need to be tested at once. You want to design the test such that the important results are readily apparent, so that we don’t have to dig too far through the data to give us the answer we need, and DoE allows us to do that.
So, in my DoE course for my Systems Engineering Master’s degree, I decided to design an experiment to test the 3G networks of AT&T and Verizon. The results may surprise you…
So as I’m sure you’ve all heard by now, Apple has released their newest magical device into the world, the iPhone 4. I won’t lie to you, this phone turns me on. Like, physically… The clean lines and smart use of metal and glass (Update: Ok, maybe not so smart) say to me, “This was designed as a tool, not a toy.”
Unfortunately, this glorious piece of techno-porn is only available on one carrier, AT&T. Luckily, Verizon offers a variety of smartphones that are close-to, but not equal-to, the iPhone 4. The Droid Incredible is probably your best bet, and while it lacks a forward-facing camera, it does have voice-to-text for emails and text messages, and microSD expandable storage. It’s also pretty easy on the eyes.
I don’t do many voice calls on my AT&T account anymore. So for me, the main factor that would determine whether or not I would buy a new iPhone and stay on AT&T, or get a Droid Incredible and switch to Verizon, is the quality of the 3G Network. So how can I test this? What factors affect the quality of a particular network provider’s 3G data network? Here’s a quick list of the factors that we controlled, blocked, randomized, or ignored in our experiment.
- Network Technology
- Location and Local Obstacles (trees, buildings)
- Local Installed Capacity relative to Subscriber Base
- Phone Hardware
- Phone Software/State
- Instantaneous Network Traffic
Network Technology is obviously what we were testing for; AT&T and Verizon have different 3G networks, and the characteristics of those networks are defined by the technology that creates them.
Location and Local Obstacles were controlled by finding a location with both an AT&T and a Verizon tower, side-by-side, with a clear line of sight. Check it out.
Local Installed Capacity relative to Subscriber Base was assumed to be the same between Verizon and AT&T. This means that we assumed that both AT&T and Verizon would have similar acceptable levels of service quality for a given amount of subscribers. For instance, if both towers could handle 1000 users, and the average daily usage was 950 users, then both companies would think about investing in a new tower. This, quite clearly, is a TERRIBLE assumption, but there wasn’t really much we could do about it, since neither company is very forthcoming on how shitty they’ll allow their service to become before upgrading…
Phone Hardware was semi-controlled: we used an iPhone 3GS for the AT&T phone and a Samsung Droid for the Verizon phone. Both are top-of-the-line phones, and both are fully capable of saturating a 3G connection without crashing or exploding.
Phone Software was controlled by using the same app to measure the response variables on both phones: the “Speedtest.net” app from Ookla Net Metrics, the company behind http://www.speedtest.net, which is usually considered the best network quality test site.
Instantaneous Network Traffic can neither be controlled nor observed. It is considered a “nuisance variable”, and the best we can do is randomize against it. However, due to time constraints, we couldn’t truly randomize our measurements, as that would force us to take single data points on random days at random times, without stacking, and would have taken YEARS, by which time the 7G networks would be rolling out and our results would be useless anyway…
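(If you’re wondering what “truly randomize” would look like in practice, here’s a quick sketch; the days, time slots, and replicate count are all made up for illustration:)

```python
import random
from itertools import product

# Hypothetical design space: every (day type, time slot) combination
days = ["Weekday", "Weekend"]
slots = ["Morning", "Midday", "Evening", "Night"]
combos = list(product(days, slots))

# A truly randomized design: scramble the run order so that a random
# traffic spike can't systematically pile onto one block of measurements
random.seed(42)  # fixed seed so the sketch is reproducible
run_order = random.sample(combos * 3, k=len(combos) * 3)  # 3 replicates each

for i, (day, slot) in enumerate(run_order[:5], start=1):
    print(f"Run {i}: measure both phones, {day} {slot}")
```

The point is that the run order itself is shuffled, instead of stacking several measurements back-to-back the way we actually had to.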
SO, we measured Ping, Download Speed, and Upload Speed on different days at different times, always both phones recording data simultaneously, and here is a quick overview of our results. You said you liked statistics right? Good.
You might need to click-to-enlarge in order to see what’s going on here… That’s What She Said.
These are Box Plots of our data points. Box plots are somewhat qualitative, but they do a great job of showing the spread of one data set relative to another.
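If you’re curious what’s actually inside a box plot, it’s just the five-number summary of the data: the median, the quartiles on either side of it, and whiskers out toward the extremes. Here’s a quick sketch using Python’s standard library on some made-up download speeds (NOT our actual measurements):

```python
import statistics

# Hypothetical download speeds in kbit/s -- illustrative only
speeds = [610, 720, 890, 950, 1020, 1100, 1240, 1430, 1500, 2900]

q1, median, q3 = statistics.quantiles(speeds, n=4)  # the three quartile cuts
iqr = q3 - q1  # interquartile range: the height of the "box"

print(f"Median: {median} kbit/s")
print(f"Box spans Q1={q1} to Q3={q3} (IQR={iqr})")

# A common outlier rule: anything beyond 1.5 * IQR past the box edges
outliers = [s for s in speeds if s < q1 - 1.5 * iqr or s > q3 + 1.5 * iqr]
print(f"Outliers: {outliers}")
```

That 1.5 × IQR rule is the same one most plotting tools use to decide which points get drawn as lone dots past the whiskers.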
First let’s look at the Box Plot for Upload speed. AT&T shows a pretty consistent speed of around 250 kbit/sec, no matter what day or time. Verizon shows a somewhat-less-consistent speed of around 800 kbit/sec most of the time, but that Weekend-Midday dataset looks like it might have something screwy. (I’ll skip a step and tell you it was an outlier, and the plot looks much better when it’s removed) But even with that screwiness, it’s pretty clear that Verizon has a faster average Upload Speed than AT&T. So rather than continue on with the Statistical Analysis, we will stop here and declare victory for Verizon.
Same sort of story for Ping; Verizon is the clear winner. But holy crap, look at that variance in the AT&T data! Let me tell you, when we were recording the data points, we were confused as hell. We’d get data points with pings of 2500ms (2.5 seconds?? Unacceptable!!) followed by another data point with a ping of 300 ms. Needless to say, Verizon kicks ass with a much better (and more consistent) ping.
Now onto Download Speed. Hmm… Things are getting interesting.
If you haven’t taken a Statistics course, you might be inclined to just take the average of the AT&T data and compare it to the average of the Verizon data. But if you had read Patrick’s article on Randomness, you would understand that the average of a small, noisy sample can be a poor estimate of the true population average. There is a LOT of random noise in that data, and if we had continued to take data points, the Box Plot would keep changing shape and position with each new piece of data. What we are looking for is a STATISTICALLY SIGNIFICANT difference between AT&T and Verizon, meaning a difference that would be extremely unlikely to arise by chance alone.
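Here’s a tiny simulation of that noise problem, with completely made-up numbers: two samples drawn from the exact same hypothetical network can still produce noticeably different averages, purely by chance.

```python
import random
import statistics

random.seed(7)  # reproducible sketch

# Two samples from the SAME hypothetical network:
# true mean 1000 kbit/s, standard deviation 400 kbit/s
sample_a = [random.gauss(1000, 400) for _ in range(8)]
sample_b = [random.gauss(1000, 400) for _ in range(8)]

mean_a = statistics.mean(sample_a)
mean_b = statistics.mean(sample_b)

print(f"Sample A average: {mean_a:.0f} kbit/s")
print(f"Sample B average: {mean_b:.0f} kbit/s")
print(f"Difference: {abs(mean_a - mean_b):.0f} kbit/s -- pure noise!")
```

If comparing raw averages were enough, this code would “prove” one identical network is faster than the other. That’s exactly the trap significance testing exists to avoid.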
* IMPORTANT SIDE NOTE *
There are only two things you need to know about Statistics. 1) Randomness (which Patrick covered) and 2) Statistical Significance (Which I will cover in more detail in the future, and am touching on here)
* END SIDE NOTE*
So how do we deal with all the noise in our data? Well I don’t want to bore you, so let me just say that we cast a magic spell on the data set that attempts to “partition” the variance in the data into separate buckets, each bucket representing a different factor that we’ve controlled or accounted for. Whatever is left over is due to the random noise. If the effect of the random noise is very small, then our magic spell worked, and we can now make conclusions about the data. If not, we can’t do very much.
The magic spell goes like this, “Teenagers with cell phones annoy their parents; show me the Analysis of Variance!” (also known as ANOVA)
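For the curious, here’s the spell with the curtain pulled back, sketched by hand on made-up numbers (not our real measurements): the total variation gets split into a “between networks” bucket and a “leftover noise” bucket, and the F ratio compares the two.

```python
import statistics

# Hypothetical download speeds (kbit/s) -- illustrative only
att = [900, 1400, 600, 1800, 1100]
vzw = [950, 1050, 1000, 1100, 900]

groups = [att, vzw]
grand_mean = statistics.mean(att + vzw)

# Between-group sum of squares: variation explained by the "network" factor
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: the leftover random noise
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1                    # 1
df_within = len(att) + len(vzw) - len(groups)   # 8

f_stat = (ss_between / df_between) / (ss_within / df_within)
# Big F: "network" explains a lot of the spread. Small F: mostly noise.
print(f"F = {f_stat:.2f}")
```

Real ANOVA software then converts that F ratio into the P value you see in the output table; here the within-group noise swamps the between-group difference, which is exactly the situation we ran into with Download Speed.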
And here are our results.
Unless you’re a Mathematician, Systems Engineer, or Statistician, I don’t expect these figures to make any sense to you… I was just trying to impress you. Did it work?
(and if you are one of those 3 people, this was a 2^3 non-randomized, blocked factorial experiment, in case you’re wondering)
What’s important here are the values under the “P” column. “P” stands for the p-value, and it tells us the probability of seeing variation this large in the data purely by chance, if the factor actually had no effect. The P value for Network (AT&T vs. Verizon) was 0.264, meaning there’s a 26.4% chance we’d see a difference this big from random noise alone; flip that around, and we’re only 73.6% confident that the networks truly differ. However, we’re SCIENTISTS dammit, and we don’t accept anything under 95% confidence! (No really, we don’t. Unless you’re a doctor testing a new drug, then 60% is acceptable.)
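In plain numbers, the decision rule we’re applying looks like this (the 0.264 is just 100% minus the 73.6% confidence we got for the Network factor):

```python
alpha = 0.05       # significance threshold: we demand 95% confidence
p_network = 0.264  # P value for the Network factor (from our ANOVA table)

# The p-value: the probability of seeing a difference this large
# if the two networks were actually identical
print(f"Chance this is just noise: {p_network:.1%}")

if p_network < alpha:
    verdict = "Reject the null: the networks really differ."
else:
    verdict = "Fail to reject: could just be random noise."
print(verdict)
```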
Blah blah blah, what’s this all mean? Basically our analysis is telling us that we can’t make a strong conclusion about whether or not AT&T’s 3G Download Speed is faster than Verizon’s, or vice versa. WELL CRAP! Download is the most important response variable for most people! I don’t give a shit about Upload, and I barely care about Ping!
In scientific experiments we classify a result like this as “Failing to reject the null hypothesis,” and it’s unfairly frowned upon in scientific journals and universities. Our null hypothesis (which we were trying to disprove) was, “there is no difference between Verizon and AT&T’s average 3G network download speed.” Based on the data we collected, we could not justify a rejection of the null hypothesis. In practice, this means that a study like this could never be published in an academic journal, because journals feel that results which don’t disprove something aren’t adding new value to the field of study.
But look at our data! We showed that Verizon has a much tighter variance in Download Speed than AT&T (not to mention their stellar upload speeds and pings). Isn’t that adding value to the field of study? On top of this, the two populations clearly differ in variance, and a formal equal-variance test would likely confirm that the networks behave differently, even if we can’t prove their averages do. Does this difference make one network “better” than the other? That’s up to YOU, and YOU deserve to see this information. And no, I’m not just talking about cell phone providers now… think of all the vital information we could be missing that’s simply hidden inside the un-rejected null hypotheses of various studies around the world! I encourage all our readers who are involved in a scientific field to be unafraid to report a result which fails to reject the null hypothesis, and instead learn to make your quote-unquote “negative” results more “sexy” and appealing.
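For example, here’s how you’d quantify that consistency difference with a quick standard-deviation comparison (made-up samples again, not our real data):

```python
import statistics

# Hypothetical download speeds (kbit/s): wide-spread vs. tight-spread
att = [300, 2500, 800, 1900, 600, 2200]
vzw = [950, 1050, 1000, 1100, 900, 1000]

sd_att = statistics.stdev(att)  # sample standard deviation
sd_vzw = statistics.stdev(vzw)

print(f"AT&T    std dev: {sd_att:.0f} kbit/s")
print(f"Verizon std dev: {sd_vzw:.0f} kbit/s")
print(f"Variance ratio: {(sd_att / sd_vzw) ** 2:.1f}x")

# A formal equal-variance test (an F-test or Levene's test) would
# turn this descriptive comparison into a proper significance result.
```

Two networks with the same average but wildly different spreads are NOT interchangeable, and that’s exactly the kind of finding that gets buried when only rejected nulls make it to print.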
So in conclusion… the iPhone turns me on, AT&T sucks, Statistics are confusing, and scientific journals are staffed by a bunch of crotchety old men who lack excitement in their lives.
I’m Out. PEACE!