Sometimes algorithms matter a lot
  • Computational costs become a limiting factor when dealing with big systems. The exponential growth in hardware performance observed over the past 60 years (Moore's law) is expected to end in the early 2020s. More advanced machine learning approaches (e.g., neural networks) exhibit cost scaling that grows polynomially with the dimension and the size of the statistics, rendering some form of ad hoc preprocessing and pre-reduction with simpler approaches (e.g., clustering methods) unavoidable in big data situations (see the first code sketch after the link below). However, these ad hoc preprocessing steps might impose a strong bias that is not easy to quantify. At the same time, a lower cost of the method typically goes hand in hand with a lower quality of the obtained data representations (see Fig. 1). Since the amount of data collected in most natural sciences is expected to continue its exponential growth in the near future, pressure on both the computational performance (quality) and the scaling (cost) of algorithms will increase.

    Instead of solving the discretization, feature selection, and prediction problems separately, the introduced computational procedure (SPA) solves them simultaneously. The iteration complexity of SPA scales linearly with the data size. The amount of communication between processors in the parallel implementation is independent of the data size and linear in the data dimension (Fig. 2), making it appropriate for big data applications (the second sketch below imitates this communication pattern). Hence, SPA did not require any form of data pre-reduction for any of the considered applications. As shown in Fig. 1, while having essentially the same iteration cost scaling as the very popular and computationally very cheap K-means algorithm (16, 17), SPA achieves substantially higher approximation quality and a much higher parallel speedup as the data size T grows.

    https://advances.sciencemag.org/content/6/5/eaaw0961
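To make the pre-reduction idea from the excerpt concrete, here is a minimal sketch in Python/NumPy: a cheap K-means pass compresses T raw samples down to K weighted centers that a more expensive downstream method could then consume. This is not the paper's SPA procedure; the data, the value of K, and the function names are assumptions for illustration only, and the example also hints at why collapsing the data onto K points introduces the hard-to-quantify bias mentioned above.

```python
# Illustrative only: a cheap K-means pre-reduction of a large sample set,
# NOT the SPA procedure from the paper. Data, K, and names are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, K, n_iter=20):
    """Plain Lloyd's K-means; each iteration costs O(T * K * d)."""
    T, d = X.shape
    centers = X[rng.choice(T, size=K, replace=False)].copy()  # random init
    for _ in range(n_iter):
        # Assign every point to its nearest center: O(T * K * d).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned points.
        for k in range(K):
            members = X[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return centers, labels

# Synthetic stand-in for "big" raw data: T samples in d dimensions.
T, d, K = 20_000, 10, 50
X = rng.normal(size=(T, d))

centers, labels = kmeans(X, K)
weights = np.bincount(labels, minlength=K) / T   # fraction of data behind each center

# A costlier downstream method would now see only K weighted centers instead of
# T raw samples -- a huge saving, but collapsing the data onto K points is
# exactly the hard-to-quantify bias the excerpt warns about.
print(centers.shape, weights.shape)              # (50, 10) (50,)
```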

    sa12295.jpg
    800 x 485 - 101K
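The claim that inter-processor communication stays independent of the data size can be imitated with an ordinary map-reduce style K-means update (again a stand-in under my own assumptions, not SPA): each simulated worker scans only its own chunk of the data and sends back just K partial sums and K counts, i.e. on the order of K*d numbers per round no matter how many samples it holds. All sizes and names below are illustrative.

```python
# Illustrative stand-in for the communication pattern described in the excerpt:
# a map-reduce style K-means update where each "worker" sends only O(K * d)
# numbers (partial sums + counts), never the raw data. Not the SPA algorithm.
import numpy as np

rng = np.random.default_rng(1)

def local_stats(chunk, centers):
    """One worker's pass: per-cluster sums and counts for its own chunk."""
    K, d = centers.shape
    labels = ((chunk[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    sums = np.zeros((K, d))
    counts = np.zeros(K)
    for k in range(K):
        members = chunk[labels == k]
        sums[k] = members.sum(axis=0)
        counts[k] = len(members)
    return sums, counts                      # message size: K*d + K floats

T, d, K, n_workers = 40_000, 8, 20, 4
X = rng.normal(size=(T, d))
chunks = np.array_split(X, n_workers)        # pretend each chunk lives on its own node
centers = X[rng.choice(T, size=K, replace=False)].copy()

for _ in range(10):                          # a few synchronous update rounds
    messages = [local_stats(c, centers) for c in chunks]   # "map" on each worker
    total_sums = sum(m[0] for m in messages)               # "reduce": aggregate stats
    total_counts = sum(m[1] for m in messages)
    nonempty = total_counts > 0
    centers[nonempty] = total_sums[nonempty] / total_counts[nonempty][:, None]

# Doubling T doubles each worker's local work, but every message stays K*d + K
# numbers, and its size grows only with the data dimension d.
print(centers.shape)                         # (20, 8)
```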