Relative trading strategy on soy/cotton futures

I was poking around Bloomberg the other day when I came across the agricultural mapping function. It’s not really developed enough to be usable beyond a regional level, but it got me thinking about what the empirical evidence says about the link between soybean and cotton futures.

Empirical evidence of a link

Theoretically, there should be some sort of link, given the large overlap in the growing areas of soy and cotton, particularly in the Mississippi Delta in the southern US. And since the futures are on US deliveries, there should naturally be some sort of lagged link: if soybean prices are relatively higher one year, we might expect more soy planting the following season, with at least some of this at the expense of cotton. This isn’t necessarily to claim a direct correlation, but we can assume that there are forces which keep the ratio of the two prices within some range. Otherwise, farmers would switch from one crop to the other (although it may take up to a year for farmers to react).


Indexed prices of the most active cotton and soybean futures contracts.

Firstly, we can take a look at the active futures price for each. Clearly there is some link, but it’s also obvious that there is room for wild swings. It would also seem at first glance that the correlation between the two is increasing. Interestingly though, the distributions of the two assets are quite different. Cotton futures prices seem to follow a lognormal distribution, while soybeans are more uniform.

But while their distributions are different, the ratio between the two is quite stable. So I took a look at what the ratio of active soybean futures contract prices to cotton futures prices looks like. Looking at the autocorrelation results of this ratio, we can see it is most definitely not the product of a random walk. This leads me to believe that there is at least some predictive power in this ratio.


Autocorrelation function of the soy/cotton futures ratio.
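For anyone who wants to run this kind of check themselves, here is a minimal Python sketch of a sample autocorrelation function. The series it runs on is a made-up mean reverting ratio built around 15.6, not the Bloomberg data, and the names sample_acf and ratio are just illustrative.

```python
import random

def sample_acf(series, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + lag] for t in range(n - lag)) / var
            for lag in range(max_lag + 1)]

# Hypothetical stand-in for the weekly soy/cotton ratio: each step is pulled
# back toward 15.6 and hit by a random shock (illustrative numbers only).
random.seed(0)
ratio = [15.6]
for _ in range(299):
    ratio.append(ratio[-1] + 0.1 * (15.6 - ratio[-1]) + random.gauss(0, 0.3))

acf = sample_acf(ratio, 10)
for lag, rho in enumerate(acf):
    print(f"lag {lag:2d}: {rho:+.3f}")
```

Running the same function on the week-on-week changes of the ratio, and not only on its level, is a useful complement, since a level series can look persistent for several reasons.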

The trade idea

Which leads me to my actual trade idea. I believe that the ratio of soybean/cotton futures prices is stable enough on a weekly basis to form a solid trading strategy during periods of low to moderate volatility.


In the above graph, the blue line represents the ratio, and the straight solid blue line represents the average of the ratio over this period. The yellow and grey horizontal lines sit half a standard deviation above and below the average respectively. Finally, the orange line is another mean reverting GBM function (similar to the one covered in an earlier post and its second part). It is important to note, however, that the data shown above skips the period of massive volatility in cotton prices in 2010/2011.

As we can see, over the past 5-6 years the ratio has oscillated around an average of 15.6, going as high as ~24 and as low as ~12. During periods of relatively low volatility, such as right now before the important spring data releases, I think it is worth trading on the assumption that this relationship holds. Entry/exit points at the half standard deviation mark provide a good balance between transaction costs and trading levels. Stop loss levels should be placed not too far away from these bounds.
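As a rough illustration of the entry rule, the sketch below flags when the ratio sits outside a band of half a standard deviation around its average. The 15.6 average comes from the chart; the standard deviation of 2.0 is an assumed placeholder, and the function name and signal labels are my own. Reading "entry" as "the ratio is outside the band" is also an assumption.

```python
def band_signal(ratio, mean, stdev, width=0.5):
    """Classify one ratio observation against a mean +/- width*stdev band."""
    upper = mean + width * stdev
    lower = mean - width * stdev
    if ratio > upper:
        return "short_ratio"   # ratio looks rich: short soy, long cotton
    if ratio < lower:
        return "long_ratio"    # ratio looks cheap: long soy, short cotton
    return "flat"              # inside the band: no position

MEAN_RATIO = 15.6      # average of the ratio, from the post
ASSUMED_STDEV = 2.0    # placeholder value, not taken from the data

for observed in (14.0, 15.9, 17.0):
    print(observed, band_signal(observed, MEAN_RATIO, ASSUMED_STDEV))
```

A stop loss would sit somewhere beyond these bounds; where exactly is a judgment call that this sketch does not try to make.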

In order to put down roughly equal amounts on each commodity, a ratio of 3:2 cotton to soy contracts, for a total notional of about $200,000, would produce roughly offsetting trade legs. Going long 2 soy contracts and short 3 cotton contracts (for a little under $15,00 in margin) could deliver perhaps $14,850 in gross profit should the ratio return to its historic levels, which seems to happen within a cycle of about 6 months. I didn’t look at further dated contracts, so it would obviously involve rolling over the positions several times.
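The 3:2 sizing can be sanity checked with a few lines of arithmetic. The contract sizes are the standard ones (5,000 bushels for CBOT soybeans, 50,000 pounds for ICE Cotton No.2); the two prices are round numbers assumed purely for illustration, not quotes.

```python
SOY_BUSHELS_PER_CONTRACT = 5_000    # CBOT soybean futures contract size
COTTON_LBS_PER_CONTRACT = 50_000    # ICE Cotton No.2 contract size

soy_price = 10.00     # USD per bushel, assumed for illustration
cotton_price = 0.70   # USD per pound, assumed for illustration

soy_contracts = 2
cotton_contracts = 3

soy_notional = soy_contracts * SOY_BUSHELS_PER_CONTRACT * soy_price
cotton_notional = cotton_contracts * COTTON_LBS_PER_CONTRACT * cotton_price
total_notional = soy_notional + cotton_notional

print(f"soy leg:    ${soy_notional:,.0f}")
print(f"cotton leg: ${cotton_notional:,.0f}")
print(f"total:      ${total_notional:,.0f}")
```

At these assumed prices the two legs come out within about 5% of each other, which is what "roughly offsetting" means here.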


Of course, there is a large risk that the dynamics between the two commodities change, and if the basis between the two actually widens there would be a risk of large losses. From what we saw with the autocorrelation function, the possibility of such a violent basis change should be fairly low, especially during a time of few data releases. Still, while stop losses could minimize this risk, the possibility of sudden and violent volatility should never be discounted!






Forecasting global acreage (part 2)

The Context

In my last post on this topic, we created a function in Excel to help us project the global acreage of cotton planting.

In this post, I take the next steps in actually forecasting what acreage could be in the coming years using some simple Matlab code. The goal here is to specify our variables (which I largely covered in the last post), and to project this function into the future, say, 10 years. I then have Matlab run the simulation a thousand times to allow us to calculate the average projection for each year into the future. Now, in our case, keep in mind that a mean reverting function should produce projections around the mean, so why proceed anyway? Two reasons. Firstly, current plantings are well below long term averages, so this analysis could help us project how quickly they will revert to the mean, and the probability distribution of plantings above or below certain levels. Secondly, we just covered this topic in my stochastic calculus class and I wanted to try it out.

Are the data normal?

The first thing I did was verify that the variable we are considering can indeed be described as normally distributed, as the function we created depends on this assumption. A few ways to do this are with a Jarque-Bera test, a Q-Q plot, or visually with a histogram. As we can see from the graphs below, as well as from the results of the JB test (not shown), the distribution of cotton plantings is indeed more or less what we would expect if the variable were normally distributed. Knowing this, we can proceed with our analysis. Oftentimes, and especially with financial assets, the period-on-period change is used rather than the actual level, but since in our case both distributions look “relatively” normal, it’s safe for us to proceed directly with projecting absolute levels.
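For reference, the Jarque-Bera statistic is simple enough to compute by hand: JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis. The sketch below is a plain Python version run on simulated data, not the USDA series; the sample of 50 normal draws is a stand-in I made up using the mean and volatility from the code listing.

```python
import random

def jarque_bera(data):
    """Return (JB statistic, skewness, kurtosis) for a sample."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return jb, skew, kurt

# Hypothetical sample: 50 draws from a normal distribution (stand-in data).
rng = random.Random(1)
sample = [rng.gauss(32602, 1633) for _ in range(50)]

jb, skew, kurt = jarque_bera(sample)
print(f"JB = {jb:.3f}, skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
# Under normality JB is asymptotically chi-squared with 2 degrees of freedom,
# so values above roughly 5.99 reject normality at the 5% level.
```

With only around 50 yearly observations the chi-squared approximation is rough, so it makes sense to read the test alongside the histogram and Q-Q plot and not on its own.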


The Monte Carlo Simulation

With this small piece of the puzzle done, we can now look at how this acreage will affect price movements over the next year. But first, let’s take a look at the results:

The first graph here looks a bit crazy, but it is a good visual representation of what I’ve done: 1000 simulations of different trajectories 10 years into the future, starting out with this year’s estimated acreage in 2017/18E.


Graphic of all 1000 simulations

The Results

Here is perhaps a more useful one, a line chart of the average projections for each year into the future. If we recall back to part 1 of this series, the function we used is a mean reverting one. Therefore, it makes sense that a Monte Carlo projection centers around the long term mean. What is perhaps more interesting for us is the distribution of results we get.


Monte Carlo projection

Below is a histogram of both our projections for next year, as well as the projection for 2026/2027. Again, a mean reverting function will have a fairly concentrated distribution, but it is interesting to note the lower ranges for next year compared to the future, as global cotton plantings this year are well below the normal long term average.


Distributions for projections of 2018/19 & 2026/27

Finally, with this data in hand, we can actually get to more useful and practical applications, such as constructing probability intervals for the coming year. Our average projection for 2018/19 is 31,893,000 hectares to be planted, with a confidence interval of (31,440 ; 32,346) thousand hectares at a 95% level of confidence. Note that this does not mean I claim with 95% certainty that the actual number of hectares planted will be within this range. The technical interpretation is that we can be 95% certain that the true average projection is in this interval: it corrects for sampling error, not modelling error! (That is, the error that the model does not perfectly reflect the variable in question, which, of course, no model does.)
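The interval above can be reproduced with the usual mean ± 1.96 × standard error formula. The sketch below plugs in the figures from the post and from the Matlab listing; the divisor of 50 is the one that appears in the listing's confidence interval lines, and I am assuming it stands for the number of yearly observations.

```python
import math

mean_projection = 31_893   # average simulated 2018/19 plantings, 1000s of hectares
stdev = 1_633              # assumed yearly volatility, 1000s of hectares
n_obs = 50                 # divisor used in the listing (50^0.5); meaning assumed

half_width = 1.96 * stdev / math.sqrt(n_obs)
ci_95 = (mean_projection - half_width, mean_projection + half_width)

print(f"{ci_95[0]:,.0f} ; {ci_95[1]:,.0f}")   # prints "31,440 ; 32,346"
```

If the divisor were the 1,000 simulated paths instead, the same formula would give an interval less than a quarter as wide, so it matters which sample size the standard error is meant to reflect.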


Overall, this was definitely overkill for such a small piece of my overall database. More than anything it was a good opportunity to become familiar with the USDA data, and try out some stuff I’ve been learning in class. But it’s a good exercise to get things started, and may also help me when it comes to projecting global production, and then prices. This will be the next piece in building my global database, so until next time!


The Code: For those who are interested…

%% Example Monte Carlo simulation for cotton plantings worldwide
% Created in December 2016 by Alex Molson
% Simple Monte Carlo simulation to project 1000s of hectares of cotton planted

%Checking for normality
%Assumed reconstruction of the plotting calls: 'Hectares' is a hypothetical
%name for a vector of historical plantings; the check is skipped if it has
%not been loaded into the workspace
if exist('Hectares','var')
    figure
    subplot(1,2,1)
    histogram(diff(Hectares))
    title('Change in hectares')
    subplot(1,2,2)
    histogram(Hectares)
    title('Total hectares')
end

%Section 1: Specifying variables
S0=30488; %Current year's cotton plantings
Stdev=1633; %Assumed volatility in yearly plantings
k=0.65; %Assumed speed of adjustment coefficient
Avg=32602; %Assumed long term average for cotton plantings
n=11; %How many years I would like included in the analysis (including current year)
n1=1000; %How many trials of simulation I would like to run

Stch=normrnd(0,Stdev,[n,n1]); %Creates the stochastic gyrations

Est=zeros(n,n1); %Creates a matrix of zeros that we will populate with our estimates
Est(1,1:n1)=S0; %Forces first value for all trials to be current year's

%This small section actually integrates the function for each row value in
%each trial after S0
for i=2:n
    for j=1:n1
        %Assumed update step (reconstructed): mean reversion toward Avg at
        %speed k, plus that year's random shock
        Est(i,j)=Est(i-1,j)+k*(Avg-Est(i-1,j))+Stch(i,j);
    end
end

%Creates a small plot of all our trials
figure
plot(Est)
set(gca,'XLim',[1 n])
set(gca,'YTickLabel',num2str(get(gca,'YTick')','%d'))
set(gca,'XTickLabelRotation',45)

%Creates a column vector containing the average simulation values for each
%year in the future, then creates a small plot of it
M=mean(Est,2);
figure
plot(M)
set(gca,'XLim',[1 n])
set(gca,'YTickLabel',num2str(get(gca,'YTick')','%d'))
set(gca,'XTickLabelRotation',45)

disp(M); %Displays the vector with our estimates

%Histograms of the simulated values for 2018/19 (row 2) and 2026/27 (row 10)
figure
histogram(Est(2,:))
hold on
histogram(Est(10,:))
hold off
title('2018/19E vs 2026/27E')

%%Confidence interval
%The divisor 50 is kept as written; it is assumed to be the number of
%yearly observations in the sample
ConfInt=[M(2,1)-(1.96*Stdev/(50^0.5)), M(2,1)+(1.96*Stdev/(50^0.5))] %95% level
ConfInt2=[M(2,1)-(1.645*Stdev/(50^0.5)), M(2,1)+(1.645*Stdev/(50^0.5))] %90% level

Forecasting global acreage (part 1)

The Context

As with any commodity, I’m starting my analysis by building up my various databases. In this case, I’m mostly using USDA annual reports, which compile detailed data by country on acreage, yields, stocks, production, trade, and consumption. It’s a lot of copy-pasting, similar to what I used to do when I worked as an equity analyst. One can only imagine how many times the same work is done around the world in our field…

In any case, in the process of building up the database I also naturally took to a bit of forecasting. The idea will be to eventually build up a picture of total supply and demand. The first step I’ll be taking in this forecasting will be the projection of global acreage devoted to cotton planting. In part 1 here, we will be creating a function to help us project future acreage, which we will then use in a Monte Carlo simulation later on, perhaps in part 2.

Creating our Function

Of course, acreage (hectarage, technically) devoted to cotton will undoubtedly reflect a combination of observed cotton prices, planters’ expectations, and the relative competitiveness of other crops. Clearly it would be very difficult to effectively include all these variables to come up with a precise forecast. In fact, if we are hoping to make a forecast of cotton prices, and plan on using acreage as one of our inputs, then we can’t also use prices in projecting acreage: it would be a circular calculation. It would also be a bit of overkill for what is, in the broader scheme of things, a small part of my analysis.

Instead, to get things started I’ll be forecasting global acreage using a mean reverting geometric Brownian motion function, or at least a kind of GBM, of the form:


Here, we project next year’s acreage based on last year’s figure (Xt), the speed of adjustment coefficient (k chapeau, as it is called in French; this determines how quickly the function will revert to its mean), the long term average of global acreage (mu chapeau), and a stochastic error term (which itself is distributed according to the volatility of yearly acreage). Last year’s acreage is observable, but we’ll need to estimate k chapeau, mu chapeau, and the standard deviation of our stochastic error term.
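One way to write a function matching this description is sketched below. The additive, normally distributed error term is an assumption on my part, chosen because the Matlab code in part 2 draws its shocks from a normal distribution in hectare units; sigma chapeau is simply my label for the standard deviation of that error term.

```latex
X_{t+1} = X_t + \hat{k}\,\bigl(\hat{\mu} - X_t\bigr) + \varepsilon_{t+1},
\qquad \varepsilon_{t+1} \sim \mathcal{N}\bigl(0,\ \hat{\sigma}^{2}\bigr)
```

With this form, the expected value for next year is simply last year’s figure moved a fraction k chapeau of the way toward the long term average.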

We could do this using a simple regression (of Xt+1 on Xt) in Excel or Matlab, but for this particular data series I’ve decided to just wing it: in our case, the volatility of annual planting is growing as farmers become more methodical about crop planting, so the regression results are a bit misleading. We should be able to produce a reliable function in any case.

For our model, I simply use the average acreage of our data set as the long term average (I use USDA FAS data here going back to 1966), and have estimated k chapeau at 0.65. For eyeballing this coefficient, we can calculate the half-life of our mean reversion, which for 0.65 is about HL = ln(2)/0.65 = 1.07, so we would expect the difference between annual acreage and the long term average to reduce by about half within a period of just over one year. Standard deviation is just calculated as the annual volatility for our data sample, from 1966.
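The half-life arithmetic is easy to check. The sketch below computes ln(2)/k for k = 0.65 and, for comparison, the half-life implied if the gap to the mean shrinks by a fixed fraction k once per year. That second, discrete reading is an assumption about how the function is applied, not something stated above.

```python
import math

k = 0.65  # speed of adjustment coefficient from the post

# Continuous-time reading: the gap decays like exp(-k * t)
half_life_continuous = math.log(2) / k
print(round(half_life_continuous, 2))   # 1.07

# Discrete yearly reading (assumed): the gap is multiplied by (1 - k) each year
half_life_discrete = math.log(0.5) / math.log(1 - k)
print(f"{half_life_discrete:.2f}")
```

Either way the message is the same: with k around 0.65, most of any gap between plantings and the long term average is expected to close within a year or so.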

So my final function in Excel was:


As for what this means in projecting global acreage, we now have the building blocks for Monte Carlo or other statistical analyses. Below we can see a sample of projections from the function: a Monte Carlo analysis would involve repeating the experiment ten or a hundred thousand times and extracting probabilities from these results.


A sample of projections from our function for cotton hectares planted globally.
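To give a flavour of what "repeating the experiment and extracting probabilities" looks like, here is a minimal Python sketch. It borrows the parameter values from the Matlab listing in part 2 (starting level 30,488, long term average 32,602, k of 0.65, volatility 1,633, all in thousands of hectares) and assumes an additive normal shock each year. The function name and the 30,000 threshold are purely illustrative.

```python
import random
import statistics

def simulate_paths(x0, mu, k, sigma, years, trials, seed=0):
    """Simulate mean reverting paths: x <- x + k*(mu - x) + normal shock."""
    rng = random.Random(seed)
    paths = []
    for _ in range(trials):
        x = x0
        path = [x]
        for _ in range(years):
            x = x + k * (mu - x) + rng.gauss(0, sigma)
            path.append(x)
        paths.append(path)
    return paths

paths = simulate_paths(x0=30488, mu=32602, k=0.65, sigma=1633,
                       years=10, trials=10_000)

next_year = [p[1] for p in paths]       # simulated plantings one year ahead
final_year = [p[10] for p in paths]     # simulated plantings ten years ahead

mean_next = statistics.fmean(next_year)
mean_final = statistics.fmean(final_year)
prob_below = sum(v < 30_000 for v in next_year) / len(next_year)

print(f"mean next year:       {mean_next:,.0f}")
print(f"mean in ten years:    {mean_final:,.0f}")
print(f"P(next year < 30,000): {prob_below:.1%}")
```

The average path drifts back toward the long term mean, as a mean reverting function must, so the interesting output is the spread of outcomes and the probabilities read off it.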

So there we have it, a function to help us project annual cotton acreage, which will, in turn, be useful in calculating global production and supply. Of course, these remain just rough estimates; putting too much emphasis on models is dangerous, but for my purposes in getting a grip on the dynamics of the cotton market, we’re off to a good start!

Range Based Trading ahead

Historical volatility in No.2 futures contracts since 2012 has been about 36% annualized, but we have seen a sharp decline in volatility since 2015, over which time it has averaged about 15%. As with any market, and with commodities in particular, the risk of large and sudden swings in volatility is certainly present, but the period of relative calm also presents a window of opportunity. Since the spike in August, trading has been relatively range bound, and I predict this will continue until the spring data and projection releases beginning in March. The No.2 futures contract began a slight correction several days ago, but somewhere around 71-71.5 should present an attractive entry point for the next several weeks. Look for a subsequent exit point just below 75.


No.2 futures price and projection
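For anyone curious how an annualized volatility figure like the ones quoted above is typically produced, here is a minimal sketch: take the standard deviation of daily log returns and scale by the square root of the number of trading days. The 252-day convention and the handful of closing prices are assumptions for illustration, not the actual No.2 data, and this is not necessarily the exact method behind the numbers in the post.

```python
import math
import statistics

def annualized_vol(prices, periods_per_year=252):
    """Annualized volatility from a list of consecutive closing prices."""
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return statistics.stdev(log_returns) * math.sqrt(periods_per_year)

# Hypothetical daily closes in cents per pound (made-up numbers)
closes = [70.1, 70.8, 70.3, 71.2, 71.0, 71.9, 71.4, 72.3]

vol = annualized_vol(closes)
print(f"annualized volatility: {vol:.1%}")
```

In practice the calculation would be run over a rolling window so that the drop in volatility since 2015 shows up as a time series.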

Futures continue march upwards

Cotton futures continued their march upwards on Wednesday afternoon, having undergone some selling earlier in the day. This continues the trend we have seen over the last several months, with the market now approaching highs seen in August.



Cotton No.2 futures price (ICE,

In fact, despite an almost equal number of losing and winning days, No.2 futures contracts are up over 3% m/m.


Reminder – USDA Cotton Ginnings

As a quick reminder, the USDA will be releasing its next Cotton Ginnings report this Thursday, January 12th. There will be a second January release on the 23rd as well. The Cotton Ginnings report compiles the number of running bales ginned to date for all cotton and American-Pima cotton by state, and includes county ginnings as well for the mid-month state ginnings.