MORE@DIAG - Speakers: Tommaso Colombo; Ludovica Maccarrone
Speaker: Tommaso Colombo
Title: Recurrent Neural Networks: why do LSTM networks perform so well in time series prediction?
(Joint work with: Alberto De Santis, Stefano Lucidi)
Abstract:
Long Short-Term Memory (LSTM) networks are a widely used and well-known variant of Recurrent Neural Networks, heavily employed, e.g., in time series forecasting and natural language processing. Deep learning has been revolutionary in many fields over the last two decades (e.g., image recognition, natural language processing, ...) thanks to the continuous increase in computing speed and capacity. Training deep learning machines requires solving complex, highly nonlinear, large-scale optimization problems; nevertheless, few authors have studied the theoretical properties of such problems. Because of their structure, Recurrent Neural Networks are the natural choice for time series forecasting, but their training leads to one of the most difficult optimization problems in Deep Learning. This difficulty is closely tied to the vanishing gradients problem, which arises when trying to latch information over long time periods.
There are two main approaches to overcoming these training issues: a structural one (e.g., LSTM and other memory-based networks) and an algorithmic one (e.g., gradient truncation, …). The most effective recent approaches usually combine the two, i.e., a structure tailored to the problem and an optimization algorithm tailored to the structure at hand.
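As an illustration of the vanishing gradients problem mentioned above, here is a minimal sketch for a scalar recurrent unit h_t = tanh(w * h_{t-1} + x_t). The function name, the weight value 0.9 and the constant inputs are assumptions chosen for illustration; they are not taken from the talk.

```python
import math

def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_T / d h_0| for the scalar RNN h_t = tanh(w * h_{t-1} + x_t).

    Hypothetical toy model, used only to illustrate vanishing gradients.
    """
    h, grad = h0, 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2), and |1 - h_t^2| <= 1.
        grad *= w * (1.0 - h * h)
    return abs(grad)

# Assumed settings: recurrent weight 0.9 and a constant input of 0.1.
short_horizon = gradient_through_time(0.9, [0.1] * 5)
long_horizon = gradient_through_time(0.9, [0.1] * 100)
print(short_horizon, long_horizon)
```

Each step multiplies the gradient by a factor of magnitude at most |w|, so with |w| < 1 the sensitivity of the final state to the initial one decays at least geometrically with the sequence length. This is the effect that makes long-term dependencies hard to learn with plain recurrent units.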
--------------------------------------
Speaker: Ludovica Maccarrone
Title: A new grey-box approach to solve the workforce scheduling problem in complex manufacturing and logistic contexts
(Joint work with Stefano Lucidi)
Abstract:
We present a new approach to solving the workforce scheduling problem in complex application contexts such as manufacturing and logistics processes.
We consider systems in which one or more workloads must be processed sequentially in different areas by different types of operators, characterized exclusively by their skills. We assume that the demand for such skills is not fixed and may be varied to match the time/cost objectives of the organization. Furthermore, due to the complexity of the processes considered, we assume that it is not possible to derive an analytic expression linking the number of resources of each type working on an activity to the time needed to complete it. For this reason, a set of ad hoc simulators can be employed, and their outputs serve as parameters of our formulation.
Typical issues arising in workforce management applications relate to the need to minimize labor cost while meeting deadlines and industrial plans. These resource/time trade-offs are even more complex under our assumptions because of the simulators, which natively split the problem into two sequential sub-problems. Our strategy addresses these difficulties through a decomposition approach that allows the problem to be modeled as a grey-box optimization problem, combining a new scheduling formulation with the simulation of some complex manufacturing and logistics processes.
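To make the grey-box idea concrete, the following is a minimal sketch, not the formulation presented in the talk. It assumes a hypothetical simulator `simulate_duration` (here a made-up closed-form stand-in) whose outputs are used as parameters, and it replaces the scheduling formulation with a plain enumeration of staffing levels. The area names, workloads, cost model and deadline are all illustrative assumptions.

```python
from itertools import product

def simulate_duration(area, operators):
    """Hypothetical stand-in for an ad hoc simulator: hours needed in `area`."""
    workload = {"picking": 40.0, "packing": 24.0}[area]  # assumed workloads
    # Assumed behaviour: more operators speed things up but add congestion.
    return workload / operators + 0.5 * operators

def cheapest_staffing(areas, max_ops, hourly_cost, deadline):
    """Enumerate staffing levels; return (cost, staffing) or None if infeasible."""
    best = None
    for staffing in product(range(1, max_ops + 1), repeat=len(areas)):
        # Black-box part: durations come from the simulator, not a formula
        # known to the optimizer.
        durations = [simulate_duration(a, n) for a, n in zip(areas, staffing)]
        if sum(durations) > deadline:  # areas are processed sequentially
            continue
        cost = sum(hourly_cost * n * d for n, d in zip(staffing, durations))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(areas, staffing)))
    return best

result = cheapest_staffing(["picking", "packing"], max_ops=8,
                           hourly_cost=1.0, deadline=20.0)
print(result)
```

The optimizer only sees the simulator through its outputs, which is what makes the problem "grey": the cost and deadline structure is explicit, while the link between staffing and duration is not. Exhaustive enumeration is used here purely for brevity and would not scale to realistic instances.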