Here, you'll explore an inductive method for scientific inquiry: starting from data, such as those provided by earth.nullschool.net, you'll look for possible correlations. We'll start out with a focus on cyclonic synoptic systems. These systems are atmospheric patterns hundreds of kilometers wide that you can easily recognize by the way the wind is blowing: in the Northern Hemisphere, cyclonic winds circle in a counter-clockwise direction around a center point. Cyclonic synoptic systems are generally correlated with stormy weather, strong winds, clouds and rainfall, and low atmospheric pressure at their centers. Strong cyclonic systems (that is, systems with especially low pressure, lots of rainfall, and fast winds) are called tropical storms. In extreme cases, tropical storms are categorized as hurricanes or typhoons.
If you go to the Southern Hemisphere, all the above is the same, except cyclonic winds circle in a clockwise direction.
While we're at it, we may as well define anticyclonic synoptic systems. These are the opposite of cyclonic systems: clockwise wind circulation in the Northern Hemisphere, and counter-clockwise in the Southern Hemisphere. Anticyclonic synoptic systems are correlated with fair weather, subsiding air, clear skies, dry conditions, and high atmospheric pressure at their centers.
Synoptic systems often move around over the course of a few days, but there's a special class of synoptic systems called semi-permanent: they tend to hang around, or return to, the same location, sometimes for centuries. Semi-permanent anticyclonic systems in the Pacific are thought to contribute to mega-drought conditions in California and the Atacama Desert, for example.
We've prepared some data for this module, to get things started. The information came from earth.nullschool.net, on a particular day (August 8, 2021), after a search for cyclonic systems. For each such system, two data were noted:
These data were then recorded, one line for each location, in a file called 'dataset1.txt'. The first line of this file (beginning with #) contains metadata about the dataset: extra information, like the units of the pressure and wind speed. Numpy's loadtxt function ignores any line that begins with # when it loads the data into Python.
As preparation for this module, you should have a look at 'dataset1.txt' and make a note of the metadata: what are the units, and which column refers to what information? If you really want to get ahead, you could walk through this notebook! That will involve writing or modifying a little code, and creating your own datasets for subsequent investigation.
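To see how loadtxt treats a metadata line, here is a minimal sketch that reads from an in-memory stand-in rather than the real file. The header wording is an assumption (check 'dataset1.txt' for the actual metadata); the numbers are the first three rows printed later in this notebook.

```python
import io
import numpy as np

# Hypothetical stand-in for 'dataset1.txt'. The header text is made up for
# illustration; the rows are copied from the output shown in this notebook.
text = """# column 1: pressure, column 2: max wind speed (header wording is hypothetical)
999 75
985 86
992 58
"""

# loadtxt skips lines starting with '#' (its default comment character),
# so only the three numeric rows end up in the array.
sample = np.loadtxt(io.StringIO(text))
print(np.shape(sample))
print(sample)
```

The shape reported here counts only the data rows, which is a quick way to confirm that the metadata line was skipped rather than read as data.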
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook
The next cell loads an example dataset (taken from earth.nullschool.net), and (using numpy's shape function) reports the number of rows and columns in the dataset.
After that comes slicing. Slicing consists of extracting part of a grid of data, such as a single column or a single row. For a two-dimensional array, the first index tells you the row number, and the second index tells you the column number. Counting starts at zero, so index 0 is the first row or column. A colon (":") means the entire row or column. So when you see a command like 'data[:,0]', it means "give me all the rows in the first column of 'data'".
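As a small illustration of these indexing rules, the sketch below slices a three-row array. The array name 'grid' and the variable names are just for this example; the values are the first three rows of dataset1 as printed in this notebook.

```python
import numpy as np

# First three rows of dataset1 (pressure, max wind speed)
grid = np.array([[999., 75.],
                 [985., 86.],
                 [992., 58.]])

first_column = grid[:, 0]   # all rows, column 0
second_row = grid[1, :]     # row 1, all columns
single_value = grid[2, 1]   # row 2, column 1

print(first_column)
print(second_row)
print(single_value)
```

Note that a slice of a column or a row comes back as a one-dimensional array, while giving both indices as numbers returns a single value.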
# Load the data
filename = 'dataset1.txt'
data = np.loadtxt(filename)
print(np.shape(data))
# Slice out the pressure
pressure = data[:,0]
print(pressure)
# Slice out the max wind speed
maxwindspeed = data[:,1]
print(maxwindspeed)
# Initialize the plot window
plt.figure()
# Plot
plt.plot(pressure,maxwindspeed)
# Annotate
plt.xlabel('pressure (hPa)')
plt.ylabel('max wind speed (km/h)')
title = 'placeholder title'
plt.title(title)
plt.grid(True)
(6, 2)
[ 999.  985.  992.  998. 1010.  984.]
[75. 86. 58. 44. 54. 82.]
In the cell below, duplicate the contents of the cell above, but fix a few things:
### BEGIN SOLUTION
# Load the data
filename = 'dataset1.txt'
data = np.loadtxt(filename)
print("The shape is", np.shape(data))
print(data[:,0])
# Slice out the pressure
pressure = data[:,0]
print(pressure)
# Slice out the max wind speed
maxwindspeed = data[:,1]
print(maxwindspeed)
# Initialize the plot window
plt.figure()
# Plot
plt.plot(pressure,maxwindspeed,'o')
# Annotate
plt.xlabel('pressure (hPa)')
plt.ylabel('max wind speed (km/h)')
title = 'Low-pressure systems from earth.nullschool.net, 8 August 2021'
plt.title(title)
plt.grid(True)
### END SOLUTION
The shape is (6, 2)
[ 999.  985.  992.  998. 1010.  984.]
[ 999.  985.  992.  998. 1010.  984.]
[75. 86. 58. 44. 54. 82.]
Assuming you've detected what seems to be a trend in the scatter plot above, use the cell below to articulate one or two ideas about how dependable that trend is, and how you might go about making it more dependable.
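One possible way to put a number on a trend, offered only as a sketch and not as the required answer, is the Pearson correlation coefficient from numpy's corrcoef function. The arrays below are typed in from the dataset1 output above so that the example runs on its own.

```python
import numpy as np

# Values copied from the dataset1 output shown above
pressure = np.array([999., 985., 992., 998., 1010., 984.])
maxwindspeed = np.array([75., 86., 58., 44., 54., 82.])

# corrcoef returns a 2x2 matrix; the off-diagonal entry is the
# correlation between the two variables.
r = np.corrcoef(pressure, maxwindspeed)[0, 1]
print("Pearson r =", round(r, 2), "from", len(pressure), "points")
```

A value near -1 or +1 suggests a tight linear relationship and a value near 0 suggests none. With only six points taken on a single day, the coefficient can shift a lot if one point is added or removed, so treat it as a rough indication.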
In the cell below, repeat the above analysis, using your own low-pressure vs wind speed data (which you should store in 'dataset2.txt'). This means creating dataset2.txt, populating it with pressure & wind speed data you get from earth.nullschool.net, then loading the data into Python using np.loadtxt and making a scatter plot like the one above.
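You can create 'dataset2.txt' in any text editor, following the layout of 'dataset1.txt'. If you would rather write it from Python, the sketch below shows one way using numpy's savetxt function. The two rows are stand-ins copied from dataset1, the header wording is an assumption, and the file goes to a temporary folder so that a real 'dataset2.txt' is not overwritten.

```python
import os
import tempfile
import numpy as np

# Stand-in rows (pressure, max wind speed) copied from dataset1's output;
# replace them with your own readings from earth.nullschool.net.
rows = np.array([[999., 75.],
                 [985., 86.]])

# Temporary location used for this sketch only
path = os.path.join(tempfile.mkdtemp(), 'dataset2.txt')

# savetxt prefixes the header with '# ', so loadtxt skips it on reading.
np.savetxt(path, rows, fmt='%g',
           header='pressure, max wind speed (state your units here)')

data2 = np.loadtxt(path)
print(np.shape(data2))
```

To use this for the exercise, you would replace the stand-in rows with your own data and set the path to 'dataset2.txt' in the notebook's folder.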
### BEGIN SOLUTION
# This loads the 1st dataset again, but the student should be loading in dataset2.txt
filename = 'dataset1.txt'
data = np.loadtxt(filename)
print("The shape is", np.shape(data))
print(data[:,0])
# Slice out the pressure
pressure = data[:,0]
print(pressure)
# Slice out the max wind speed
maxwindspeed = data[:,1]
print(maxwindspeed)
# Initialize the plot window
plt.figure()
# Plot
plt.plot(pressure,maxwindspeed,'o')
# Annotate
plt.xlabel('pressure (hPa)')
plt.ylabel('max wind speed (km/h)')
title = 'Low-pressure systems from earth.nullschool.net, 8 August 2021'
plt.title(title)
plt.grid(True)
### END SOLUTION
The shape is (6, 2)
[ 999.  985.  992.  998. 1010.  984.]
[ 999.  985.  992.  998. 1010.  984.]
[75. 86. 58. 44. 54. 82.]
Use the cell below to articulate one or two observations regarding the results in dataset2 vs those in dataset1.
Don't forget to
In the cell below, carry out another analysis using data derived from earth.nullschool.net. You have a lot of freedom to decide what those data will be, but the idea is that the result will be a scatter plot of some kind that displays a correlation. Here are some ideas ...
### BEGIN SOLUTION
# This loads the 1st dataset again, but the student should be loading in their own new dataset
filename = 'dataset1.txt'
data = np.loadtxt(filename)
print("The shape is", np.shape(data))
print(data[:,0])
# Slice out the pressure
pressure = data[:,0]
print(pressure)
# Slice out the max wind speed
maxwindspeed = data[:,1]
print(maxwindspeed)
# Initialize the plot window
plt.figure()
# Plot
plt.plot(pressure,maxwindspeed,'o')
# Annotate
plt.xlabel('pressure (hPa)')
plt.ylabel('max wind speed (km/h)')
title = 'Low-pressure systems from earth.nullschool.net, 8 August 2021'
plt.title(title)
plt.grid(True)
### END SOLUTION
The shape is (6, 2)
[ 999.  985.  992.  998. 1010.  984.]
[ 999.  985.  992.  998. 1010.  984.]
[75. 86. 58. 44. 54. 82.]
We're at the end of the notebook. You should repeat the "Three steps for refreshing and saving your code" you did before.
Find the "Validate" button and press it. If there are any errors or warnings, fix them.