
It’s a long way to the top (if you wanna rock ‘n’ roll)

Yes, it’s a long way to the top, especially if you want to publish your first paper in a journal ranked in the first quartile of Economics. But, after a year of hard work, good news arrived at the end of 2016. The International Journal of Forecasting decided that my first paper, Empowering cash managers to achieve cost savings by improving predictive accuracy, deserved publication. It has been a long way to the top, but I wanted rock ‘n’ roll. At several moments of the process of writing, submission and revision, I really thought that I was on a highway to hell.

The underlying idea of the paper was simple: if predictive accuracy is a good thing, then cost savings in cash management must be correlated with better forecasts. Even though the idea was simple, turning it into a convincing message was not so easy. After providing the necessary background on the cash management problem, we introduced the main characteristics of the cash flow data sets used in the experiments. Then, we proposed five different forecasters for comparative purposes: autoregressive, linear regression, radial basis functions, random forests and a seasonal interaction model (the latter suggested by a reviewer). To evaluate them, we relied on a time-series cross-validation procedure by Hyndman, and the winners were… random forests (for data set 1) and linear regression (for data set 2).
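For readers who have never seen time-series cross-validation, here is a minimal sketch of the general idea (a rolling forecasting origin with an expanding training window). It is not the code from the paper: the synthetic series, the two toy forecasters and the names `rolling_origin_mae`, `mean_forecaster` and `ar1_forecaster` are my own assumptions for illustration.

```python
import numpy as np

def rolling_origin_mae(series, forecaster, min_train=20):
    """Mean absolute one-step-ahead error over expanding training windows."""
    errors = []
    for t in range(min_train, len(series)):
        train = series[:t]              # only data observed before time t
        prediction = forecaster(train)  # one-step-ahead forecast for time t
        errors.append(abs(series[t] - prediction))
    return float(np.mean(errors))

def mean_forecaster(train):
    """Baseline: predict the historical mean."""
    return float(np.mean(train))

def ar1_forecaster(train):
    """AR(1) with intercept, fitted by ordinary least squares."""
    x, y = train[:-1], train[1:]
    design = np.column_stack([np.ones_like(x), x])
    (intercept, slope), *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(intercept + slope * train[-1])

# Synthetic, strongly autocorrelated series (illustrative data only).
rng = np.random.default_rng(0)
flows = np.zeros(200)
for t in range(1, 200):
    flows[t] = 0.9 * flows[t - 1] + rng.normal()

mae_mean = rolling_origin_mae(flows, mean_forecaster)
mae_ar1 = rolling_origin_mae(flows, ar1_forecaster)
print(f"mean forecaster MAE: {mae_mean:.3f}")
print(f"AR(1) forecaster MAE: {mae_ar1:.3f}")
```

The point of the design is that every forecast is made using only past observations, so the ranking of forecasters is not contaminated by information from the future.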

The crucial question came next: do better forecasts produce better cash management policies? We expected so, but we had to show it empirically. To this end, we implemented a recent cash management model from the literature that uses cash flow forecasts as a key input. We tested a wide range of cash flow forecasts with different accuracies, and the results left us thunderstruck or, in other words, gave us an empirical confirmation of the savings hypothesis: better forecasts produce better cash management policies in terms of cost. As a final contribution, we proposed a general methodology to help cash managers estimate whether their efforts to improve predictive accuracy are rewarded by proportional cost savings.
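To give a feeling for the savings hypothesis, here is a deliberately simplified toy, not the cash management model used in the paper. It assumes a hypothetical policy that targets a zero end-of-day balance, ignores transaction costs, and uses made-up holding and shortage costs and synthetic cash flows.

```python
import numpy as np

HOLDING_COST = 0.0001   # assumed daily cost per unit of idle positive balance
SHORTAGE_COST = 0.0030  # assumed daily cost per unit of overdraft

def policy_cost(actual, forecast):
    """Total cost when each day's transfer offsets the forecast net cash flow.

    The hypothetical policy targets a zero end-of-day balance and yesterday's
    residual balance is assumed to be cleared at no cost, so the realised
    balance on each day equals that day's forecast error.
    """
    balance = np.asarray(actual) - np.asarray(forecast)
    holding = HOLDING_COST * np.clip(balance, 0, None).sum()
    shortage = SHORTAGE_COST * np.clip(-balance, 0, None).sum()
    return float(holding + shortage)

rng = np.random.default_rng(1)
actual = rng.normal(loc=0.0, scale=1000.0, size=250)  # synthetic net cash flows
noise = rng.normal(size=250)                          # one fixed error pattern

# Scale the same error pattern to emulate forecasters of decreasing accuracy.
error_scales = [0.0, 100.0, 300.0, 1000.0]
costs = [policy_cost(actual, actual + s * noise) for s in error_scales]
for s, c in zip(error_scales, costs):
    print(f"forecast error scale {s:7.1f} -> cost {c:.2f}")
```

In this toy the cost grows in step with the forecast error by construction. A realistic model, with transaction costs and balances carried over from day to day, need not behave so neatly, which is exactly why the relationship had to be checked empirically.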

The work was done and we had nothing left to do but wait for the good news (if any). The e-mail confirmation arrived at the end of 2016 and I felt as if you shook me all night long.

Thanks to AC/DC for providing the appropriate songs to illustrate my thoughts.


Facing business problems from a data point of view

Almost every aspect of business is open to data collection. That data is later used to obtain useful information that can help improve the quality of decision-making. You can even increase the quantity of decision-making by automating decision processes that have to be performed on a regular basis. From data, information technology can be used to report periodically (database querying perspective), to describe aspects of interest (basic statistics perspective) or to find patterns (data mining perspective). All of these aspects are closely linked to each other and, frequently, more sophisticated techniques such as data mining rely on more basic techniques such as database querying.

Extracting useful knowledge from data requires a structured approach that helps us find the right way to the goals we pursue. One can face problems without any methodology and succeed, but it is always useful to have a map when you go to the mountains. When facing business problems, a very useful framework can be found in CRISP-DM (Cross Industry Standard Process for Data Mining). Business understanding, data understanding, data preparation, modeling, evaluation and deployment are the main phases into which a data mining project can be broken down. We will come back to CRISP-DM later (hopefully in a future post). For now, it is enough to keep in mind that a method is available.

The data perspective leads us to transform raw data into features that better represent the problem we are dealing with. Let’s see this with an example. Suppose you are the IT manager of a medium-sized company and you are asked to suggest alternative processes to reduce cash management costs using the company’s available IT resources. Yes, my friend, it is all about money! It is also very likely that you barely know how cash managers deal with receipts and payments day after day. The only thing you know for sure is that they are always angry.

What is the first thing you can do to face the cash management problem from a data point of view? Think for a while… yes, you are right! The first thing you can do is write down a complete definition of the process. How? The more data you use, the better. Cash is the lifeblood of a company and has different origins and different ends. Incoming cash comes from customers, banks, shareholders or even from public institutions in the form of grants or subsidies. Outgoing cash goes to vendors, employees and, again, banks, public institutions (as taxes) and shareholders (as dividends). Incoming and outgoing, receipts and payments, collections and disbursements, inflows and outflows… At least one thing seems clear here: cash management is a problem with two dimensions, with two directions, “from” the company and “to” the company.

Ok, one step closer to the finish line. Let’s move on to the next one. An exploratory analysis of the available data will show you that, apart from identifying cash flow related entities such as customers and vendors, a number of interesting features can be used to gain a deeper knowledge of the problem. Currencies, exchange rates, payment modes, payment terms, transaction dates and the countries of origin of customers and vendors are good examples of what we may be interested in. Ultimately, cash flow management is an exercise in speeding up collections so that payments can be made on time. In this sense, trying to predict the future will likely help you make the right decision in the present. This is not much different from predicting the customers’ payment behavior and our own payment behavior. Surprisingly, sometimes it is easier to do the former than the latter.
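As a small illustration of turning raw data into features, the two directions of cash described above can be sketched as a daily aggregation of transactions. Everything here is hypothetical: the ledger rows, the counterparty names and the helper `daily_cash_flows` are assumptions made up for the example, not data from any real company.

```python
from collections import defaultdict
from datetime import date

# Hypothetical ledger rows: (transaction date, counterparty, signed amount).
# Positive amounts are receipts ("to" the company); negative ones are payments ("from" it).
transactions = [
    (date(2017, 1, 2), "customer A", 1200.0),
    (date(2017, 1, 2), "vendor X", -450.0),
    (date(2017, 1, 3), "customer B", 300.0),
    (date(2017, 1, 3), "payroll", -900.0),
    (date(2017, 1, 3), "customer A", 150.0),
]

def daily_cash_flows(rows):
    """Aggregate signed transactions into inflow, outflow and net per day."""
    totals = defaultdict(lambda: {"inflow": 0.0, "outflow": 0.0})
    for day, _counterparty, amount in rows:
        key = "inflow" if amount >= 0 else "outflow"
        totals[day][key] += abs(amount)
    return {
        day: {**flows, "net": flows["inflow"] - flows["outflow"]}
        for day, flows in sorted(totals.items())
    }

daily = daily_cash_flows(transactions)
for day, flows in daily.items():
    print(day, flows)
```

A daily net cash flow series like this one is the kind of input a forecaster would work with. Features such as currency, payment terms or country could be added as extra columns in the same way.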

Here you can find two useful references related to this post:

  • Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., & Wirth, R. (2000). CRISP-DM 1.0 Step-by-step data mining guide.
  • Provost, F., & Fawcett, T. (2013). Data Science for Business: What you need to know about data mining and data-analytic thinking.