I have just finished the last day of my internship here at MIT. Even though I am leaving, I am certainly not done. I spent a lot of time this week preparing my codes and data for me to work on them remotely. I have enough for a solid presentation now, but adding even more correlations makes things clearer and better, so I will continue to run more data between now and AGU.
Looking back, this internship has been amazing. I learned a lot about seismology, computers, research, and life this summer, and it has made a big impact on how I think about my future. I have known for a long time that science is the path for me, but was previously somewhat unsure about whether I would do well in a research career. I was also nervous about being totally on my own, because I had never lived anywhere but Lincoln, Nebraska before. After this summer, though, both of those concerns have been greatly alleviated; I definitely want to go to graduate school and work toward a career in geophysics. My confidence in this path has been building slowly all summer, but it really hit home when the postdoc with whom I have been working told me that he has been very impressed by me this summer and thinks I have a bright future in geophysics. I am excited to return to Nebraska and see my home, family, and friends again soon, but I will definitely miss it here; MIT is a great place for a nerd like me to be!
Only one week is remaining for me at MIT! I have been trying my hardest this week to begin wrapping things up for my departure. I set up my remote access to the computer that I have been using here, so I can keep using our 64-core cluster to correlate and stack more data even after I return to Nebraska. I will also save much of the completed data and the codes I have written on a flash drive so that I can create figures and other things for my poster later. I picked a “winner” out of the procedures that I have been testing, so I will continue to process more data with (drumroll, please) the amplitude-based sorting system, instead of with the comparisons to the earthquake catalog. My group and I are still a little confused as to why sorting by amplitude works better; I think it might be due to non-earthquake events that cause high-amplitude data, such as storms.
After much editing, re-editing, and re-re-editing, I finally called it good on my abstract for the AGU meeting and clicked the “submit” button. As I have been learning more throughout the summer, I have been growing more and more excited for AGU. The meeting is still four long, homework-filled months away, and I will probably grow yet more excited as it draws nearer.
I have also been rushing to get as much of a fun Boston experience as possible before I leave. Last weekend, I sat in a Fenway Park seat (I did not actually attend a Red Sox game; the seat was an old one, taken out during a renovation, that a pizza place bought to put inside of their Sox-themed restaurant), saw an early typeset printing of the Declaration of Independence at the Museum of Fine Arts, and browsed the fancy boutiques of Newbury Street.
The body waves are revealing themselves! Among the sets of correlations I have been carefully creating and inspecting, a small but coherent wave has shown up before the surface wave in my plots. I am overjoyed by its appearance, because it happened just in time for me to write about it in my AGU abstract, which is due next Wednesday. I was a bit worried about being too speculative, but now that the wave is actually visible, I am more confident in what I can express in the abstract. Despite my confidence, I am still new to abstract-writing in general and thus not stellar at it, but the more experienced members of my group have been terrific about helping me with editing.
Now that I have found a wave, I need to figure out exactly what it is. It appears to be an S wave, but there are several phases of S wave it could be, like Sn or SmS, that have moved through different parts of the earth before arriving back at the surface. Luckily, I have SEIZMO’s TauP function, which returns information such as the arrival time and speed of the wave when given a source/receiver distance. I will use TauP to compare the known behavior of various phases with what I observe in my plots, and hopefully be able to pick out the phases I have been seeing.
In an attempt to improve the clarity of the waves in my plots, I am correlating and stacking more of my data to add to it. I am also trying another stacking method: instead of removing the time windows after earthquakes, I will try keeping only those windows. Earthquake coda, the time after an earthquake occurs, can show different phases with different levels of visibility than plain ambient noise, so it is worth looking into.
Another factor I am interested in investigating is the effect of the region of origin on the appearance of the waves. Previous recovery of body wave phases has mostly been from areas with relatively simple geologic structure like cratons, but the United States has a wide variety of geology across it. I plan on sorting apart my data by the stations’ locations so that I can see if waves are easier or harder to find in different parts of the country, like the Great Plains or the Rocky Mountains.
I got to have a fun break from research on Thursday evening by attending a Drum Corps International show in Quincy, a southern suburb of Boston. DCI is like major league marching band; the groups have rigorous auditions, rehearse practically nonstop, and go on a performance and competition tour through the summer. To a marching band geek like me, DCI bands are practically rock stars, and they are a blast to see in real life. My favorite shows of the night were by the Madison Scouts and the Boston Crusaders.
My professor and postdocs have been gone for most of this week at a conference, but I have kept plenty busy on my own. I have been creating the same set of figures (plotted correlations, amplitude comparisons, beamforming, et cetera) for a variety of single-year correlations (some with high-amplitude traces taken out, some with time windows of earthquakes taken out, some correlated with different time spans, et cetera) in order to figure out which methods of correlation and stacking can do the best job of picking out waves. Once I declare a “winner” out of the many combinations of methods, which will hopefully happen early next week, I will apply it to the full eight years of seismic data.
I have continued to take advantage of the MIT environment by going to more lectures; they are usually only an hour apiece, so I can easily attend them without having to spend an extreme amount of time catching up on work later. On Monday, I attended a talk about cloud microphysics (the way that tiny bits of water, ice, and other materials behave in clouds) and found it very interesting. Tuesday afternoon, I went to a lecture about the use of proteins in polymer chemistry, and only barely understood what was going on, but still found what I did understand to be pretty neat. On Thursday, I listened to a materials science talk about growing perfect silicon crystals; it was also rather cool.
Last weekend, I decided to kick back for an afternoon at the beach! Revere Beach, near the end of one of Boston’s subway lines, was hosting a sand sculpting festival, so I saw some truly astounding artwork in addition to the normal beach sights of waves, shells, seagulls, dead stingrays, strange tourists, and slightly questionable food vendors.
I had a real adventure over last weekend: two days in Maine! The coastal cities there are only a few hours away by train, so I visited a friend there who used to live in Nebraska; I had not seen her in over a year. We happily reunited and she made sure I got the full Maine experience; we saw lighthouses, relaxed on the beach, poked around tide pools on rocky coasts, and ate local lobster and whoopie pies. It was really great to get out of the big city for a while, see an old friend, and experience a new place.
Back at the office this week, the search for body waves has continued. I have been doing a lot of beamforming, the process which takes a set of correlations and returns a plot showing the times and speeds for which there is wave propagation, over various windows of times and station separations, searching for bright blobs that indicate a wave. In this search, I get hints on where to look from SEIZMO’s TauP function, which gives the arrival time and speed of various seismic wave phases at any entered distance.
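To make the beamforming idea a little more concrete, here is a minimal slant-stack sketch in plain Python. It is not the SEIZMO routine or the inverse Radon transform used in this project; the function name `slant_stack`, the synthetic pulse, and the trial speeds are all made up for illustration. For each trial speed and each trial arrival time at a reference distance, it averages the traces along the matching straight line in the distance-time plane, so a real wave should show up as a bright spot.

```python
import math

def slant_stack(traces, distances_km, dt, speeds_km_s, ref_times_s, ref_distance_km):
    """Average the traces along lines t = t_ref + (x - x_ref) / v.

    Returns beam[i][j] for speeds_km_s[i] and ref_times_s[j].
    """
    beam = []
    for v in speeds_km_s:
        slowness = 1.0 / v  # seconds per kilometer
        row = []
        for t_ref in ref_times_s:
            total, count = 0.0, 0
            for trace, x in zip(traces, distances_km):
                k = int(round((t_ref + slowness * (x - ref_distance_km)) / dt))
                if 0 <= k < len(trace):  # skip samples that fall off the trace
                    total += trace[k]
                    count += 1
            row.append(total / count if count else 0.0)
        beam.append(row)
    return beam

# Synthetic test data (an assumption): one pulse moving outward at 2.8 km/s.
dt = 1.0
distances_km = list(range(100, 501, 50))
traces = [[math.exp(-((k * dt - x / 2.8) / 2.0) ** 2) for k in range(300)]
          for x in distances_km]

speeds = [2.0, 2.4, 2.8, 3.2, 3.6]
ref_times = [float(t) for t in range(60, 160)]
beam = slant_stack(traces, distances_km, dt, speeds, ref_times, 300.0)

# The brightest point of the beam gives the best speed and arrival time.
best_amp, best_speed, best_time = max(
    (beam[i][j], speeds[i], ref_times[j])
    for i in range(len(speeds))
    for j in range(len(ref_times))
)
print(best_speed, best_time)
```

On this synthetic data the brightest point lands at the speed used to build the pulse, with an arrival time at 300 km close to 300 / 2.8 seconds.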
I took a break from my own research today to listen to a seminar from a professor doing research in nonlinear optics at Cornell. His group has been using silicon-based microresonators (rings of silicon compounds at which you point lasers) to make frequency combs (like a bunch of lasers all at once, each with a different frequency, and all of the frequencies are equal intervals apart). It was nice to take a break from being moderately confused by geophysics to be moderately confused by regular physics for a while.
I have also been working hard on my communicating with non-scientists project, a slide show and notes that would guide me in explaining my research in a presentation to the general public. In addition to improving my scientific communication skills as a whole, it is helping me come up with the elevator speech version of my research that I know I will have to repeatedly give to my classmates, professors, friends, family, and more upon my return to Nebraska. On the whole, it is an interesting and useful project!
Time until AGU abstract deadline: two and a half weeks
Total time left at MIT: one month
The waves are beginning to reveal themselves! Using the first year of our seismic readings, I have correlated the data, stacked the correlations, and then plotted the stacks into figures where the waves can actually be visible. In addition to simply looking at the plots of the stacks, I have also been searching for the body waves by using a handy process called beamforming, which involves taking the correlation stacks and the separation distances of their stations, and then searching for wave propagation at a range of velocities by using the inverse Radon transform. Different types of waves move at different speeds, so the phases present can be revealed by looking at the velocities that show motion. I show both a plot of stacks and an inverse Radon transform plot below.
So far, I have primarily been seeing the dominant surface waves instead of the more subtle body waves for which I am looking, but seeing any waves at all means I am generally on the right track. My first plan in making the body waves appear is changing the way I stack the correlations. Currently, we drop the “loud” time windows that might contain an earthquake by using a cutoff root-mean-square of each window’s trace, but this method can also remove windows of good data that are loud, such as during a big storm. Instead, I have just created and started testing a function that removes windows with earthquakes by comparing the times of the windows to a global catalog of earthquakes. It may appear that I should have been using this method since the start, but the other way is much easier to program and much quicker to use, so I hoped it would be enough. Regardless, the trouble I am facing with the new method is deciding how much time I should remove with each earthquake: too little would be taking in bad data, but too much would be ignoring good data.
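The catalog-based removal and its trade-off can be sketched in a few lines of Python. This is only an illustration, not the MATLAB function from this project: the name `keep_quiet_windows`, the hour-long windows, and the earthquake times are all hypothetical, and a real version would read origin times from an actual catalog.

```python
def keep_quiet_windows(window_starts, window_length, quake_times, pad_after):
    """Return the start times of windows that do not overlap any
    interval [quake_time, quake_time + pad_after]. All times in seconds."""
    kept = []
    for start in window_starts:
        end = start + window_length
        overlaps = any(start < q + pad_after and end > q for q in quake_times)
        if not overlaps:
            kept.append(start)
    return kept

# One day of hour-long windows and two made-up earthquake origin times.
window_starts = [hour * 3600 for hour in range(24)]
quake_times = [5000, 40000]

short_pad = keep_quiet_windows(window_starts, 3600, quake_times, pad_after=1800)
long_pad = keep_quiet_windows(window_starts, 3600, quake_times, pad_after=7200)
print(len(short_pad), len(long_pad))
```

With these made-up numbers, a half-hour pad drops 2 of the 24 windows while a two-hour pad drops 6, which is the "too little versus too much" question in miniature.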
Assignment #6: figures
I have uploaded two figures here: imgur.com/a/VGnzO . You can click on each one to zoom in on it. They have no figure titles and the axis titles are rather limited, but these are not meant to be poster-worthy; they are just an example of the step through which I am currently working on my way to some real answers. I made both figures in MATLAB, the first with SeismicLab’s wigb function and the second with MATLAB’s imagesc function. Descriptions of the figure contents are as follows:
Figure One - Each line in this plot represents the correlations from one pair of seismic stations. A wiggle in one line shows a wave arrival at one station when the other is considered to be a virtual source (that is basically how a correlation works). The correlation lines here are actually stacks of correlation traces from an entire year of data, giving them a much greater signal-to-noise ratio than individual traces. When arranged with respect to the separation distance between the stations (most of which happened to end up negative here because of how we set it up for East-West locations), a wave propagating at about 2.8 kilometers per second becomes quite clear; this wave is the surface wave. The waves for which I am actually looking are body waves, which would have greater speed and lower amplitude, but none are visible in this figure. Hopefully, I will have a figure showing body waves soon.
Figure Two - This colormap displays the inverse Radon transform of the correlation stacks mentioned above, and despite its radically different appearance from figure one, this figure shows much of the same information we gathered before. The inverse Radon transform (which I was able to do with the SEIZMO suite of seismic analysis tools for MATLAB from Washington University in St. Louis) starts with a chosen distance (I picked 300 kilometers separation) and looks for waves arriving over a range of times and propagating over a range of velocities. The bright blob on the graph says that a wave moving at angular slowness 38 seconds/degree (about 2.8 kilometers/second) crossed the 300 kilometer mark at around 110 seconds; this speed and time indicate that the blob is the same surface wave we saw in figure one. The blob is made of alternating high and low stripes and has more subtle stripes emanating from it because of interference from all of the individual correlation stacks. If this data had a strong body wave in it, there would be another blob with lower amplitude located above and to the left of the surface wave.
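The virtual-source idea behind Figure One can be sketched with a toy cross-correlation in plain Python. Everything here is an assumption for illustration: the random "noise", the 7-sample delay, and the function name `cross_correlate` are made up, and the real processing works on filtered seismic records in MATLAB. The point is only that when the same noise reaches a second station a little later, the correlation peaks at a lag equal to that travel time.

```python
import random

def cross_correlate(a, b, max_lag):
    """Return (lags, values) with values[k] = sum over n of a[n] * b[n + lag]."""
    lags = list(range(-max_lag, max_lag + 1))
    values = []
    for lag in lags:
        total = 0.0
        for n in range(len(a)):
            m = n + lag
            if 0 <= m < len(b):
                total += a[n] * b[m]
        values.append(total)
    return lags, values

# Made-up ambient noise recorded at station A, and the same noise
# arriving 7 samples later at station B.
random.seed(0)
delay = 7
station_a = [random.gauss(0.0, 1.0) for _ in range(500)]
station_b = [0.0] * delay + station_a[:-delay]

lags, values = cross_correlate(station_a, station_b, max_lag=20)
best_lag = lags[values.index(max(values))]
print(best_lag)
```

Multiplying the peak lag by the sample interval gives a travel time between the two stations, and lining up many such traces by station separation is what makes a propagating wave visible.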
Assignment #5: Free write
Lately, I have been figuring out the best plan for dealing with large events, such as earthquakes, in our data. There are many ways large events can be found, such as comparing our time windows with times in a global earthquake catalog, or searching for anomalies in the amplitude or frequency of the raw traces or their correlations. Once found, the events can either be left alone, dampened so their amplitude matches the noise around it, or removed from the data entirely. I have been looking at the correlations after various methods of dealing with these events in an attempt to determine which way will help us most in seeing the body waves for which we are looking.
The trouble I am frequently facing now is the limits of our computer power and storage. It turns out that the correlations from one year of our seismic data take up nearly three terabytes! Beyond that, further processing of those correlations, such as finding the RMS and stacking the correlations, ends up using such large amounts of memory that MATLAB and our processing cores sometimes get disagreeable. I have yet to be halted, though; I am so far able to split up and alter our functions in order to keep things manageable.
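One general way to keep memory use down while stacking is to add traces to a running sum one at a time instead of loading them all at once. The Python sketch below shows that idea only; it is not necessarily how the functions in this project were split up, and the names `running_stack` and `fake_daily_correlations` and the toy data are hypothetical.

```python
def running_stack(traces):
    """Average an iterable of equal-length traces, holding only one
    trace and the running sum in memory at a time."""
    total = None
    count = 0
    for trace in traces:
        if total is None:
            total = [0.0] * len(trace)
        for i, sample in enumerate(trace):
            total[i] += sample
        count += 1
    if count == 0:
        return []
    return [value / count for value in total]

def fake_daily_correlations(n_days, n_samples=10):
    """Yield one made-up trace per day: a steady spike at sample 5 plus
    an offset that flips sign every day, standing in for noise."""
    for day in range(n_days):
        noise = 0.5 if day % 2 == 0 else -0.5
        yield [noise + (1.0 if i == 5 else 0.0) for i in range(n_samples)]

stack = running_stack(fake_daily_correlations(4))
print(stack)
```

Because the generator hands over one trace at a time, the memory needed stays about the size of a single trace no matter how many days go into the stack.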
Even with all of these things to do, I have still found time for some fun. Last weekend, I visited Boston’s Museum of Science, where I saw musical Tesla coils, live tamarins, dinosaur skeletons, and more. I even bought a seismologically relevant souvenir at the gift shop: a t-shirt reading “it’s not my fault” with a picture of a seismogram trace. I am sure that wearing it to work will cause both smiling and the rolling of eyes. I have also learned that Boston is one of the best places in which to celebrate Independence Day. The threat of some serious storms pushed the city’s big Fourth of July festival ahead a day, but I was still able to enjoy the patriotic fun on the third. I hung out with some of my new friends and watched the city’s incredible fireworks display as it lit up the night over the Charles River.
Assignment #4: Discuss where the training wheels have been removed
This week, I feel like more has gotten done. I am more familiar with what I am doing, and I have laid enough foundation to get into bigger stuff. I have now written a function that will plot every stack of correlations pertaining to any one chosen station; each stack is plotted against time, and they are all arranged by the separation between stations, so waves present in the ambient noise are visible in their propagation outward from the first station, which acts like a wave source because of the correlation. So far, with the small amount of data I have fully processed, I am only able to see surface waves, but when I have correlated and stacked more data and found the right frequency range with which to filter, body waves should appear clearly.
Instead of the measly seven days of data I was using to write and test functions before, I now have access to the full 2,920 days we have for this project. I am certainly not using them all right away, though; I will continue using moderate amounts of data, perhaps a month’s worth or so, as I continue to adjust the parameters of my functions in order to extract the body waves most clearly. In order to process these new, larger amounts of data, I have also been given more computing power. Before, I was running things one at a time on the machine at my desk, but now I have access to a twelve-core cluster with which I can run my correlations more quickly in parallel.
I have also made strides in work-life balance lately. I went out to some neat places over last weekend, such as the New England Aquarium and Boston’s Museum of Fine Arts. I am a big fan of both cuttlefish and French impressionism, so getting to see both (and other cool things) in real life was amazing. There are many more interesting attractions in this city that are on my to-do list for free time in the future! I am progressing socially here, too; I was introduced to the department’s “cookie hour,” allowing me to meet more people and hear about more research, and I also made contact with some friends of mine that go to other colleges in the Boston area. In addition, I made some fun new friends by joining the MIT marching band as a bass drummer; the band is small, meets one evening a week, and has kindly allowed me to join them even though I will only be here for the summer. Despite the fact that I will be spending a lot of time alone at my computer for this research, it looks like I will be able to avoid becoming a total recluse.
My week has been chaotic, but interesting. First, being in a new city with new germs, I managed to catch a cold and was running on partial zombie mode for several days. Second, my parents, brother, and aunt arrived on Wednesday for their week-long visit. I have been working on my project during the day, then meeting up with them in the evening. We’re also going to spend the weekend together, seeing sights, going to museums, and more. While they were planning it months ago, I repeatedly expressed that I would be fine and they did not need to come visit me, but they insisted that they really wanted to and it would be a fun family vacation. I was right that I am fine and they did not need to visit, but I am glad they wanted to; we are having a great time here together.
Assignment #3: This week’s research and progress
I have been continuing the correlating and stacking of data as before, but have been focusing on a new step in between: sorting, which organizes the correlated data by the pair of stations from which it came. This was happening before, as a part of my stacking step, but making it more separate and organized will smooth things along once we start working with larger and larger sets of data. With a sorted matrix of all of the traces for each pair of stations, I will be able to find events like earthquakes that should be removed before stacking much more easily, and I will be able to backtrack more quickly and easily in case of errors.
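The sort-then-stack flow can be sketched in a few lines of Python. This is an illustration rather than the MATLAB code from this project: the station names, the tiny traces, the RMS cutoff, and the function names `sort_by_pair` and `stack_quiet` are all made up. Grouping the traces by station pair first makes it straightforward to drop the loud ones for each pair before averaging.

```python
import math
from collections import defaultdict

def sort_by_pair(records):
    """Group (station_a, station_b, trace) records by station pair."""
    sorted_traces = defaultdict(list)
    for station_a, station_b, trace in records:
        sorted_traces[(station_a, station_b)].append(trace)
    return dict(sorted_traces)

def rms(trace):
    """Root-mean-square amplitude of one trace."""
    return math.sqrt(sum(s * s for s in trace) / len(trace))

def stack_quiet(traces, rms_cutoff):
    """Average only the traces whose RMS is at or below the cutoff."""
    quiet = [t for t in traces if rms(t) <= rms_cutoff]
    if not quiet:
        return None
    return [sum(t[i] for t in quiet) / len(quiet) for i in range(len(quiet[0]))]

# Hypothetical correlation traces; the third one is "loud", as if an
# earthquake had landed in that time window.
records = [
    ("STA1", "STA2", [0, 1, 0, -1]),
    ("STA1", "STA3", [0, 0, 2, 0]),
    ("STA1", "STA2", [50, -40, 60, -50]),
    ("STA1", "STA2", [0, 1, 0, -1]),
]

by_pair = sort_by_pair(records)
stack_12 = stack_quiet(by_pair[("STA1", "STA2")], rms_cutoff=5.0)
print(len(by_pair[("STA1", "STA2")]), stack_12)
```

Keeping the sorted traces around, rather than stacking as they arrive, is also what makes it possible to go back and try a different cutoff without redoing the correlations.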
As I modify and test these functions, I occasionally find myself with some open time as MATLAB cycles through significant amounts of data. Luckily, I have useful tasks with which to fill this time: reading papers on similar research, looking up Wikipedia articles on signal processing, doing more computer tutorials, perusing the seismology textbook I received at IRIS orientation week, and drafting these blog entries.
I am getting better and better at MATLAB every day. As I run, edit, and create programs in it, I find myself getting stuck a lot less, asking the postdocs and/or Google for help less frequently, and being able to understand and implement solutions more quickly when I do need help. Perhaps most importantly, I am now much less afraid of doing something wrong in MATLAB, because I have learned that it will not get computer-angry at me, nor will it cause the total destruction of anything.
I also got some new tools to play with: SeismicLab, a suite of seismic data processing tools for MATLAB created by researchers in the physics department at University of Alberta. Having only just opened it up, I am not yet familiar with everything its gadgets are capable of or for what I will be using it, but I have been told that it has some great tools for graphing data.
Everything I have done so far is about understanding and improving the process that I will eventually be using to find body waves in the full set of ambient noise data. It seems like not a lot has happened yet, but I have a lot of foundation to lay before I can really start making tracks.
Things are starting to pick up here now that my professor and the postdocs are all back from their assorted adventures, so I have moved beyond just reading papers and watching computer tutorials. This week, I have spent a lot of time playing around with some of the existing MATLAB correlation and pre-processing programs, learning what they do, how they do it, and how it will be useful to me. I also wrote my first program, a simple method of stacking the correlations that come out of another program. My program is probably of questionable efficiency, since I am still quite new to this, but I can tell that in the process of making it, I have learned a lot about programming in MATLAB.
I have also begun to meet more of the people around here. I work in one corner of a large office, and the graduate students stationed in the other segments of the same office have all introduced themselves to me. When I mentioned to them that I was here through IRIS, I learned that two of the graduate students on this floor were also IRIS interns during their undergrad. Having such a network of IRIS alumni is really a cool thing.
Assignment #2: The datasets and tools with which I am working
The data with which I am working this summer is from the USArray, specifically from the Transportable Array stations between approximately 40° and 42° N. This continent-wide stripe runs from northern California on the west coast, through my hometown of Lincoln, Nebraska, and then on to my current location in Boston on the east coast. Someday, I should try to visit that area in northern California, just so I will have been on both ends and the middle of our area of interest. There are lots of researchers using USArray data, and for good reason; the USArray project is both widespread and comprehensive when it comes to how this country moves. We’re starting with the raw data, but luckily for me, most of the pre-processing functions have already been written and will only need parameter tweaking.
The primary weapon in my confrontation with the correlations is MATLAB. I will be using and editing code written by the postdocs and others, as well as writing some of my own, in order to piece together body waves from the ambient noise. Within MATLAB, I have use of the GISMO (Geophysical Institute Seismology MATLAB Objects) toolbox, a downloadable collection of functions designed for seismic trace analysis by University of Alaska Fairbanks professors. I have spent a fair bit of time over the past few days dissecting and playing around with GISMO’s correlations functions in order to figure out how best to apply them this summer.
It’s been somewhat of a slow start for me at MIT. My professor was out of town on Monday and Tuesday, and the two postdocs with whom I’ll also be working on this project are both away at a conference until next week, so I was at first left alone with papers to read and computer tutorials to watch. They were interesting papers and useful tutorials, but I’ll be glad when I have a full panel of people to whom I can ask questions. There are actually very few people around right now on my floor of the Green Building, the building where the Earth, Atmospheric, and Planetary Science department is located, so it's a lot quiet and a little creepy.
Assignment #1: What I would like to accomplish this summer
1. Be able to communicate with other professionals - I want to be able to understand the geoscience jargon well enough that I can adequately express what I am trying to accomplish to professors, grad students, postdocs, and more, as well as to be able to adequately understand what their research is about when they tell me. This skill is particularly vital in making connections with other researchers and other institutions, which will be favorable to me as I begin to explore options for graduate school. This is something on which I can improve throughout the summer and throughout my future, but I will be able to keep track of my progress along the way by evaluating my interactions with my group, members of other groups around me, and other researchers at conferences like AGU.
2. Be able to communicate with anyone - I want to be able to explain my project simply and clearly enough that non-geoscientists can understand, too. I want to tell everyone about this project, from my physics professors to my middle-school-aged cousins, and I need to find the right words to do it. I have a good start at this one, as I am already experienced in explaining my job at a condensed matter physics lab at UNL to the people around me. My ability to explain my project will improve as I learn more about it myself throughout the summer, following one of my favorite quotes by Albert Einstein: “If you can’t explain it simply, you don’t understand it well enough.”
3. Get the hang of MATLAB, Git, and more - I will be using computers in ways fairly new to me, and I would like to reduce my dependence on help from people and manuals as much as possible. I recognize how powerful these computer skills are, so I know it is in my best interest to have them on my side. I would like to be able to work more independently by one-third of the way through the summer, so that I will not be wasting the postdocs’ time with continuous questioning. By the end of the second third of the summer, I want to reduce my dependence on online help, so that I can increase my own productivity.
4. Get the hang of reading the literature - The papers which I have started reading here are quite different from the condensed matter physics papers I read at the lab in which I work at UNL. It will take a lot of time and effort, reading and re-reading, to understand the material in the papers and to learn the best ways to get the most out of them. By one-third of the way through the summer, I want to be able to explain the papers I already have, and be able to get the basic gist of any new papers I get after only a couple of read-throughs.
5. Ask at least one good question every day - I think this will be exceptionally helpful throughout the summer, and it is also just a good rule to live by.
6. Go to the New England Aquarium sometime - What can I say? I just really love aquariums. Seriously, though, work-life balance will be important this summer, because I want to get a lot accomplished, but remain a happy, functioning human being while doing so. Some of my ideas for fun include getting a library card and reading sci-fi, listening to the Boston Pops play on Independence Day, seeing if some of my friends that go to college in the area are around to hang out, taking in some of the local history, and more.
Orientation test blog post:
Hiking rocks. Sore feet do not rock. Lizards rock. Flies do not rock. Matlab rocks. Golfing does not rock. New Mexico Tech rocks. Geology rocks. Rocks rock.