Stock Price Prediction using Long Short Term Memory Networks (LSTMs) — Approach I

Naveenraj Kamalakannan
4 min read · Apr 28, 2021


Over the past decade, more and more people around the globe have started participating in share markets and trading. Investments in equity, futures and options, commodities, Bitcoin, and so on are seen as a route to lucrative returns. What if we could predict stock prices beforehand, or at least predict the trend of a stock, irrespective of the current economic scenario and corporate actions?

P.S: Okay, I don’t own a Time Machine! :D


In this story, we’ll build a forecasting model to predict the trend of a particular stock using a type of recurrent neural network, the LSTM. We will use the Adam optimizer, dropout regularization, and the mean squared error (MSE) loss function. Phase I of the story covers exploratory data analysis (EDA) on a given stock using various plots, including but not limited to Seaborn plots, scatter plots, and box plots.

Phase I — EDA on Stock Data

  1. Getting Data from Reliable Sources
from pandas_datareader import data

# Daily price data for Reliance Industries (NSE ticker) from Yahoo Finance
df = data.DataReader("RELIANCE.NS", data_source="yahoo", start="2010-04-28", end="2021-03-28")
df.head()  # preview the first five rows

This returns the stock data as a pandas DataFrame. The usual next step is to check for anomalies and NaN values and take the data through a cleaning process.

df.isna().any()

This is a simple and elegant way to check whether any column contains missing values. Since the data comes from a trustworthy source, we don’t see any anomalies.

Note: There are many data cleaning algorithms, and it is good practice to apply one before proceeding further.
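As a minimal sketch of what one basic cleaning pass could look like, the snippet below counts missing values per column and forward-fills them. The small `prices` frame is synthetic and only stands in for the downloaded data; forward-filling is one common choice for price series, not necessarily the right one for every dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded data (assumption: same kind of columns).
dates = pd.bdate_range("2021-01-01", periods=10)
prices = pd.DataFrame(
    {"Close": np.linspace(100.0, 109.0, 10), "Volume": np.arange(10, 20, dtype=float)},
    index=dates,
)
prices.iloc[3, 0] = np.nan  # inject one missing Close value for illustration

# How many values are missing in each column?
missing_per_column = prices.isna().sum()
print(missing_per_column)

# Forward-fill gaps with the last known value, then drop any rows that are
# still incomplete (for example, gaps at the very start of the data).
cleaned = prices.ffill().dropna()
print(cleaned.isna().any())
```

After this step `cleaned` has no missing values, and the filled row simply repeats the previous day's close.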

Rolling Mean

Technically speaking, the rolling mean is the mean of an n-sized window sliding from the beginning to the end of the data frame. It smooths out short-term fluctuations in the time-series data. If the next price is greater than the current rolling mean, the trend is up!

df.rolling(window = 5).mean()
# df.rolling(window = 5, min_periods = 3).mean() -> needs at least 3 non-NaN observations in a window to compute the mean
Figure: rolling mean, window size = 5
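The "price above the rolling mean means the trend is up" rule can be sketched as follows. The seven prices are made up for illustration, and `trend_up` is a hypothetical name. Each price is compared with the rolling mean available on the previous day.

```python
import pandas as pd

# Made-up closing prices (assumption: purely illustrative values).
close = pd.Series(
    [10.0, 11.0, 12.0, 13.0, 14.0, 20.0, 9.0],
    index=pd.bdate_range("2021-01-04", periods=7),
    name="Close",
)

# 5-day rolling mean; the first four entries are NaN (window not yet full).
rolling_mean = close.rolling(window=5).mean()

# Compare each price with the rolling mean known on the previous day.
previous_mean = rolling_mean.shift(1)
trend_up = close > previous_mean  # comparisons against NaN evaluate to False

print(pd.DataFrame({"Close": close, "RollingMean": rolling_mean, "TrendUp": trend_up}))
```

Here the jump to 20 is flagged as an up-trend, while the drop to 9 on the following day is not.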

Plotting Functions

i) Using simple plot function

Figure: simple plot function and its output graph
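The original plotting code was embedded as an image, so the snippet below is only a guess at what a simple plot of the close price might look like, using Matplotlib on synthetic data. The random-walk prices and the labels are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic random-walk prices (assumption: stand-in for df["Close"]).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=250)
close = pd.Series(1000 + rng.normal(0, 10, len(dates)).cumsum(), index=dates, name="Close")

# Plot the close price together with its 5-day rolling mean.
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(close.index, close.values, label="Close")
ax.plot(close.index, close.rolling(window=5).mean().values, label="5-day rolling mean")
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.set_title("Close price (synthetic data)")
ax.legend()
# In a notebook the figure renders inline; in a script, call plt.show()
# or fig.savefig("close_price.png") to see it.
```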

ii) Volume Data — Line Plot

Figure: volume line plot using Seaborn

An Economic Times Article said:

Analysts said Reliance Industries’ largest FDI deal with Saudi Aramco, the roadmap to become a zero net-debt company by March 2021, the unveiling of four new growth verticals and plans to unlock value in retail and telecom businesses had investors going gung ho on the stock.

High volumes were traded from mid-2020 until early 2021. A sharp decline and a rapid increase can be seen from 2020 to 2021. This suggests that forecasting depends not only on the open and close prices but also on the volume traded, corporate actions, other stock prices, and the economic situation.

iii) Auto Correlation Plot

from pandas.plotting import autocorrelation_plot

# Resample to business-month-end frequency ("BM") and average each month
df_sampled = df.resample("BM").mean()
autocorrelation_plot(df_sampled["Close"])
Figure: autocorrelation of Close price resampled to BM

An autocorrelation plot shows how the current value of a time series relates to its previous (lagged) values; trend, seasonal, cyclic, and residual structure in the series all show up in it. Here, more than 50% of the curve shows a significant correlation, which means past prices carry information about later ones. That is why a recurrent neural network, which learns from sequences, should be a notable asset in forecasting the prices.
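To put numbers behind such a plot, the sketch below computes the autocorrelation at each lag with `Series.autocorr` and checks how many lags fall outside an approximate 95% band of ±1.96/√n. It runs on a synthetic random walk averaged by calendar month, which is an assumption standing in for the "BM" resample. `Series.autocorr` computes a Pearson correlation on the overlapping values, so the figures can differ slightly from the curve drawn by `autocorrelation_plot`.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk "Close" prices (assumption: stand-in for the real data).
rng = np.random.default_rng(42)
dates = pd.bdate_range("2015-01-01", "2020-12-31")
close = pd.Series(1000 + rng.normal(0, 10, len(dates)).cumsum(), index=dates)

# Monthly average price (calendar months here, not business-month-end).
monthly = close.groupby(close.index.to_period("M")).mean()

# Approximate 95% significance band for an uncorrelated series.
band = 1.96 / np.sqrt(len(monthly))

# Autocorrelation for lags 1 .. n/2.
lags = range(1, len(monthly) // 2 + 1)
acf = pd.Series([monthly.autocorr(lag=k) for k in lags], index=lags)

# Share of lags whose correlation lies outside the band.
significant_share = (acf.abs() > band).mean()
print(acf.head())
print(f"Share of lags outside the band: {significant_share:.0%}")
```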

iv) Seasonal Line Plot

Figure: seasonal price variation

We don’t infer any significant features for this particular case, although we can see a price surge and subsequent peaks after June every year.
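One way to prepare data for a seasonal line plot is to pivot the series into a month-by-year table, so that each year becomes its own line. The sketch below uses a constructed price series (an assumption, chosen so the numbers are easy to verify); `by_year_month` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Constructed prices: a yearly step plus a month-dependent bump (assumption).
dates = pd.bdate_range("2018-01-01", "2020-12-31")
values = 100.0 + 10.0 * (np.asarray(dates.year) - 2018) + np.asarray(dates.month)
close = pd.Series(values, index=dates, name="Close")

# Average price per (year, month), then move the years into columns.
by_year_month = (
    close.groupby([close.index.year, close.index.month])
    .mean()
    .unstack(level=0)
    .rename_axis(index="month", columns="year")
)
print(by_year_month)
# by_year_month.plot() would then draw one line per year across the 12 months.
```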

v) Box Plot

As the seasonal line plot suggested, the seasonality box plot shows that the trading range is widest after June, whereas the trend box plot shows the annual price variation. In 2020, the range is very wide, from about 800 to over 2000. As we all know, that was the year of the market crash.
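The numbers a yearly box plot summarizes can also be read off directly. As a rough sketch on constructed prices (an assumption: 2019 rising from 1000 to 1500 and 2020 from 800 to 2000, loosely echoing the range mentioned above), the per-year minimum, median, maximum, and range can be computed like this:

```python
import numpy as np
import pandas as pd

# Constructed prices for two years (assumption: illustrative values only).
d2019 = pd.bdate_range("2019-01-01", "2019-12-31")
d2020 = pd.bdate_range("2020-01-01", "2020-12-31")
close = pd.concat([
    pd.Series(np.linspace(1000.0, 1500.0, len(d2019)), index=d2019),
    pd.Series(np.linspace(800.0, 2000.0, len(d2020)), index=d2020),
])

# Per-year summary: the extremes and median that a box plot draws.
summary = close.groupby(close.index.year).agg(["min", "median", "max"])
summary["range"] = summary["max"] - summary["min"]
print(summary)
```

The year with the largest `range` is the one whose box plot would look the widest.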

vi) Time-Series Decomposition

Figure: seasonal decomposition of the time-series data

Seasonal Component: Shows the recurring “normal” variations, i.e. the regular ups and downs of the time-series data.

Trend Component: Shows the pattern in the data that spans across seasonal periods.

Residual Component: After the time series is decomposed into trend and seasonal components, whatever is left over is the residual.
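To make the three components concrete, here is a simplified additive decomposition written with pandas only. It is a sketch, not the code behind the figure: the monthly series is synthetic, and the trend uses a plain centered 12-month moving average, whereas library routines such as `seasonal_decompose` in statsmodels use a slightly different filter for even periods.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend plus a yearly sine wave (assumption).
months = pd.period_range("2015-01", "2020-12", freq="M")
t = np.arange(len(months))
series = pd.Series(100 + 2.0 * t + 10 * np.sin(2 * np.pi * t / 12), index=months)

# Trend: centered 12-month moving average (NaN at both ends of the series).
trend = series.rolling(window=12, center=True).mean()

# Seasonal: average of the detrended values for each calendar month.
detrended = series - trend
seasonal = detrended.groupby(series.index.month).transform("mean")

# Residual: whatever is left after removing trend and seasonal parts.
residual = series - trend - seasonal

print(pd.DataFrame({"trend": trend, "seasonal": seasonal, "residual": residual}).dropna().head())
```

By construction, trend + seasonal + residual adds back up to the original series wherever the trend is defined.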

Now that we’ve done our EDA for this time-series data, let’s build our LSTM model and start predicting stocks. Not here, but in Part 2… Coming soon!

I will also soon be publishing a similar article on GeeksforGeeks. Link will be updated soon :)

Stay Tuned, Love!


Naveenraj Kamalakannan

A resolute programmer of Python and Java. Worked on Android apps and ML model deployment. Stronger in Data Analytics and Bioinformatics. I love to Code :)