**The Context**

In my last post on this topic, we built a function in Excel to help us project the global acreage of cotton plantings.

In this post, I take the next step and actually forecast what acreage could be in the coming years, using a simple MATLAB script. The goal here is to specify our variables (which I largely covered in the last post) and to project this function into the future, say, 10 years. I then have MATLAB run the simulation a thousand times so that we can calculate the average projection for each future year. Now, keep in mind that a mean-reverting function should produce projections around the mean, so why proceed anyway? Two reasons. Firstly, current plantings are well below the long-term average, so this analysis could help us project how quickly they will revert to the mean, as well as the probability of plantings ending up above or below certain levels. Secondly, we just covered this topic in my stochastic calculus class and I wanted to try it out.
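For reference, the mean-reverting function being projected can be written as a one-line recurrence. This is a restatement of the update step in the code at the end of this post; the symbols $S_t$, $\bar{S}$, $\sigma$ and $\varepsilon_t$ are my own notation, not names from the original spreadsheet.

$$
S_{t+1} = S_t + k\,(\bar{S} - S_t) + \varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim N(0, \sigma^2)
$$

Here $S_t$ is plantings in year $t$ (in thousands of hectares), $\bar{S} = 32{,}602$ is the assumed long-term average, $k = 0.65$ is the assumed speed of adjustment, and $\sigma = 1{,}633$ is the assumed yearly volatility. Each simulated trajectory starts from $S_0 = 30{,}488$ and draws an independent shock for every year.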

**Are the data normal?**

The first thing I did was verify that the variable we are considering can reasonably be described as normally distributed, since the function we created depends on this assumption. A few ways to do this are a Jarque-Bera (JB) test, a Q-Q plot, or a visual check with a histogram. As we can see from the graphs below, as well as from the results of the JB test (not shown), the distribution of cotton plantings is more or less what we would expect if the variable were normally distributed. Knowing this, we can proceed with our analysis. Often, and especially with financial assets, the period-on-period change is used rather than the actual level, but since in our case both distributions look “relatively” normal, it's safe for us to proceed directly with projecting absolute levels.
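As a rough illustration of the JB check described above, here is a minimal sketch that also returns the p-value and the test statistic. The data are a synthetic stand-in, not the USDA series; the length of 50 observations and the variable names are arbitrary assumptions of mine. `jbtest` needs the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in for the planting series (NOT the USDA data):
% 50 draws from a normal distribution, in 1000s of hectares
rng(0);                          % fixed seed so the run is repeatable
x = 32602 + 1633*randn(50,1);

% Jarque-Bera test at the default 5% significance level.
% h = 0: normality is not rejected; h = 1: normality is rejected.
[h, p, jbstat] = jbtest(x);
fprintf('Levels:  h = %d, p = %.3f, JB statistic = %.3f\n', h, p, jbstat);

% Same test on the year-on-year changes
dx = diff(x);
[hChange, pChange] = jbtest(dx);
fprintf('Changes: h = %d, p = %.3f\n', hChange, pChange);
```

Note that `h = 0` only means the test found no evidence against normality. With a short annual series that is a fairly weak statement, which is why the Q-Q plot and histogram are worth looking at as well.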

**The Monte Carlo Simulation**

With this small piece of the puzzle done, we can now look at how this acreage will affect price movements over the next year. But first, let’s take a look at the results:

The first graph here looks a bit crazy, but it is a good visual representation of what I’ve done: 1000 simulations of different trajectories 10 years into the future, starting out with this year’s estimated acreage in 2017/18E.

Graphic of all 1000 simulations

**The Results**

Here is perhaps a more useful one: a line chart of the average projection for each year into the future. Recall from part 1 of this series that the function we used is a mean-reverting one. It therefore makes sense that a Monte Carlo projection centers on the long-term mean. What is perhaps more interesting for us is the distribution of results we get.
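Because the random shocks average out to zero, the average projection can also be worked out without any simulation, which makes a handy cross-check on the line chart. The sketch below is my own addition, not part of the original script: it uses the parameter values from the code at the end of this post, and `ExpectedPath` and `GapRemaining` are names I made up.

```matlab
% Parameter values taken from the code at the end of this post
S0  = 30488;   % current year's plantings (1000s of hectares)
Avg = 32602;   % assumed long-term average
k   = 0.65;    % assumed speed of adjustment
t   = 0:10;    % years ahead (0 = current year)

% Taking expectations of Est(i) = Est(i-1) + k*(Avg - Est(i-1)) + shock
% gives E(i) - Avg = (1-k)*(E(i-1) - Avg), so the gap to the long-term
% average shrinks by a factor of (1-k) every year.
GapRemaining = (1 - k).^t;
ExpectedPath = Avg + (S0 - Avg) .* GapRemaining;

% Columns: years ahead, expected plantings, share of initial gap left
disp([t' ExpectedPath' GapRemaining']);
```

With k = 0.65, 35% of the initial gap is still open after one year, about 12% after two, and about 4% after three, so under these assumptions the average projection is essentially back at the long-term mean within three or four years.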

Monte Carlo projection

Below are histograms of our projections for next year and for 2026/27. Again, a mean-reverting function will produce a fairly concentrated distribution, but it is interesting to note that the range for next year sits lower than the range further out, as global cotton plantings this year are well below the long-term average.

Distributions for projections of 2018/19 & 2026/27
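One way to put a number on the difference between the two histograms is to count how many simulated trials land below the long-term average in each year. The sketch below is a self-contained illustration rather than the original script: it uses the parameter values from the code at the end of this post, swaps `normrnd` for base-MATLAB `randn` so that no toolbox is needed, and fixes the random seed, so the figures will not match the post's run exactly. `ShareNext` and `ShareLater` are names I made up.

```matlab
rng(1);                        % fixed seed so the run is repeatable
S0 = 30488;  Avg = 32602;      % starting level and long-term average
k  = 0.65;   Stdev = 1633;     % speed of adjustment, yearly volatility
n  = 11;     n1 = 1000;        % years (incl. current), number of trials

% Simulate all trials at once, one row per year
Est = zeros(n, n1);
Est(1,:) = S0;
for i = 2:n
    Est(i,:) = Est(i-1,:) + k*(Avg - Est(i-1,:)) + Stdev*randn(1, n1);
end

% Share of trials that end up below the long-term average
ShareNext  = mean(Est(2,:)  < Avg);   % first projected year
ShareLater = mean(Est(10,:) < Avg);   % ninth projected year

fprintf('Below average, first projected year: %.1f%%\n', 100*ShareNext);
fprintf('Below average, ninth projected year: %.1f%%\n', 100*ShareLater);
```

Under these assumptions, roughly two thirds of the trials should come in below the long-term average in the first projected year, against about half once the starting gap has closed. The same comparison works for any other threshold: replace `Avg` inside the two `mean(...)` calls.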

Finally, with this data in hand, we can get to more useful and practical applications, such as constructing probability intervals for the coming year. **Our average projection for 2018/19 is 31,893,000 hectares to be planted**, with a confidence interval of (31,440 ; 32,346), in thousands of hectares, at a 95% level of confidence. Note that this does not mean I claim with 95% certainty that the actual number of hectares planted will fall within this range. The technical interpretation is that we can be 95% confident that the true average *projection* lies in this interval: it corrects for sampling error, not modelling error (the error that comes from the model not perfectly reflecting the variable in question, which, of course, no model does).
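To make the distinction between the two kinds of interval concrete, here is a minimal sketch that computes both from simulated one-year-ahead values. It is not the calculation behind the numbers quoted above: that one uses `Stdev` and `50^0.5` (see the code at the end of this post), whereas this sketch assumes the sample standard deviation of the simulated values and the number of trials `n1`, and reads the prediction interval off the sorted simulations. `CIMean` and `PredInt` are names I made up.

```matlab
rng(1);                        % fixed seed so the run is repeatable
S0 = 30488;  Avg = 32602;      % starting level and long-term average
k  = 0.65;   Stdev = 1633;     % speed of adjustment, yearly volatility
n1 = 1000;                     % number of trials

% One-year-ahead simulated plantings (1000s of hectares)
P1 = S0 + k*(Avg - S0) + Stdev*randn(1, n1);

% 95% confidence interval for the AVERAGE projection:
% it narrows as the number of trials grows
M1     = mean(P1);
SE     = std(P1) / sqrt(n1);
CIMean = [M1 - 1.96*SE, M1 + 1.96*SE];

% 95% prediction interval for a SINGLE year's outcome under the model:
% the 2.5th and 97.5th percentiles of the simulated values
Sorted  = sort(P1);
PredInt = [Sorted(round(0.025*n1)), Sorted(round(0.975*n1))];

fprintf('CI for the mean projection: (%.0f ; %.0f)\n', CIMean);
fprintf('Prediction interval:        (%.0f ; %.0f)\n', PredInt);
```

The first interval only says how precisely the simulation pins down its own average. The second is much wider and is the one to look at when asking where actual plantings might land, always conditional on the model being a fair description of reality.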

**Conclusion**

Overall, this was definitely overkill for such a small piece of my overall database. More than anything it was a good opportunity to become familiar with the USDA data, and try out some stuff I’ve been learning in class. But it’s a good exercise to get things started, and may also help me when it comes to projecting global production, and then prices. This will be the next piece in building my global database, so until next time!

**The Code: For those who are interested…**

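```matlab
%% Example Monte Carlo simulation for cotton plantings worldwide
%
% Created in December 2016 by Alex Molson
% Simple Monte Carlo simulation to project 1000s of hectares of cotton planted

% Checking for normality
% Assumes Hectares (and Change) are already loaded in the workspace
qqplot(Hectares)          % Q-Q plot of the level series
h=jbtest(Hectares)        % Jarque-Bera test on levels (0 = normality not rejected)
h1=jbtest(Change)         % Jarque-Bera test on the change series
z=diff(Hectares)          % Year-on-year change in hectares
figure                    % New figure so the first Q-Q plot is not overwritten
qqplot(z)

% Histograms with a fitted normal curve: levels on top, changes below
figure
subplot(2,1,2)
histfit(z)
title('Change in hectares')
hold on
subplot(2,1,1)
histfit(Hectares)
title('Total hectares')
hold off
```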

```matlab
% Section 1: Specifying variables
S0=30488;    % Current year's cotton plantings
Stdev=1633;  % Assumed volatility in yearly plantings
k=0.65;      % Assumed speed of adjustment coefficient
Avg=32602;   % Assumed long term average for cotton plantings
n=11;        % How many years I would like included in the analysis (including current year)
n1=1000;     % How many trials of simulation I would like to run

Stch=normrnd(0,Stdev,[n,n1]); % Creates the stochastic gyrations
Est=zeros(n,n1);              % Creates a matrix of zeros that we will populate with our estimates
Est(1,1:n1)=S0;               % Forces first value for all trials to be current year's

% This small section actually integrates the function for each row value in
% each trial after S0
for i=2:n
    for j=1:n1
        Est(i,j)=Est(i-1,j)+(k*(Avg-Est(i-1,j)))+Stch(i,j);
    end
end
```

```matlab
% Axis labels shared by both plots (one row per simulated year)
Labels=['2016/17E';'2017/18E';'2018/19E';'2019/20E';'2020/21E';'2021/22E';'2022/23E';'2023/24E';'2024/25E';'2025/26E';'2026/27E'];

% Creates a small plot of all our trials
figure
plot(Est);
set(gca,'YTickLabel',num2str(get(gca,'YTick')','%d'))
set(gca,'XLim',[1 n])
set(gca,'XTick',1:n)
set(gca,'XTickLabel',Labels)
set(gca,'XTickLabelRotation',45)

% Creates a column vector containing the average simulation values for each
% year in the future, then creates a small plot of it
M=mean(Est,2);
figure   % New figure so the plot of all trials is not overwritten
plot(M);
set(gca,'YTickLabel',num2str(get(gca,'YTick')','%d'))
set(gca,'XLim',[1 n])
set(gca,'XTick',1:n)
set(gca,'XTickLabel',Labels)
set(gca,'XTickLabelRotation',45)

disp(M); % Displays the vector with our estimates
```

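```matlab
% Histograms of the simulated values for two individual years
P1=Est(2,1:n1);    % Row 2: first projected year
P10=Est(10,1:n1);  % Row 10: ninth projected year

figure
h1=histogram(P1)
hold on
h2=histogram(P10)
hold off
title('2018/19E vs 2026/27E')
```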

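```matlab
%% Confidence interval
Stdev1=std(P1);  % Sample standard deviation of the simulated values (not used below)

% Intervals around the mean projection M(2,1), with the standard error
% taken as Stdev/sqrt(50)
ConfInt=[M(2,1)-(1.96*Stdev/(50^0.5)), M(2,1)+(1.96*Stdev/(50^0.5))]     % 95% level
ConfInt2=[M(2,1)-(1.645*Stdev/(50^0.5)), M(2,1)+(1.645*Stdev/(50^0.5))]  % 90% level
```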