https://ijmri.de/index.php/jmsi
volume 4, issue 5, 2025
DIMENSIONALITY REDUCTION USING AN AUTOENCODER AND FORECASTING SALES DELAYS BASED ON (Q)SVR
Asliddin Sayidqulov Xusniddin ugli
Samarkand State University named after Sharof Rashidov
Keywords: autoencoder, SVR, QSVR, latent representation, dimensionality reduction, regression, price forecast, MAE, RMSE, R².
Abstract. This article examines the problem of identifying and forecasting delays in the price and sales volume of retail products. In the proposed approach, dimensionality reduction is performed with an autoencoder, and the compressed latent representations are used in Support Vector Regression (SVR) and Quantum Support Vector Regression (QSVR) models. Data on 51 product types covering a 15-month period were cleaned and enriched with time-based and categorical attributes. The latent representations were passed as inputs to the regression models, and the results confirmed that the QSVR model achieved higher accuracy and lower error than SVR. The findings demonstrate the practical effectiveness of the Autoencoder → (Q)SVR pipeline for predicting delays.
INTRODUCTION
Accurate forecasting of prices and sales volumes is of strategic importance for dealer and retail companies. As the range of features grows (time factors, agent territory, category, holidays, etc.), the computational complexity of classical models also increases. Autoencoders can compress complex feature sets into a semantically rich latent space and extract signals that are useful at the subsequent regression stage. At the same time, quantum approaches such as QSVR can model nonlinear dependencies with high expressiveness. In this work, the effectiveness of dimensionality reduction with an autoencoder and the use of the compressed representations together with SVR and QSVR on real sales data is demonstrated experimentally. Numerous studies on price and sales-volume estimation show the effectiveness of machine learning methods for high-accuracy forecasting. In 2017, Zong estimated the residual value of articulated trucks using multiple regression models. In 2018, Chiteri conducted an analysis of trucks based on auction and resale data.
In 2021, Milošević developed an approach based on ensemble regression models to predict the price of more than 500,000 pieces of construction equipment offered on the US market. Similarly, in 2021, Shehadeh and Alshboul estimated the residual value of six types of construction equipment using various regression methods based on open auction data.
In 2023, Stühler et al. compared seven advanced ML models and three AutoML approaches for 10 different types of Caterpillar equipment, based on 2,910 records obtained from a real online trading platform, and evaluated the resulting accuracy.
These studies confirm the relevance and practical effectiveness of machine learning in
price and sales forecasting.
Quantum machine learning
Combining machine learning with quantum computing is becoming increasingly relevant as a way to gain an advantage over classical methods. The parallel computational capabilities of quantum computers create wide opportunities for accelerating and simplifying machine learning tasks.
Quantum machine learning (QML) is a field focused on applying quantum algorithms to classical ML problems, and approaches built on support vector machines (SVM) play an important role in this area.
Although significant advantages of QML have been proven mainly for algorithms implemented on fault-tolerant quantum computers, practical applications are still limited to intermediate-scale quantum computers.
In this article, we chose the quantum support vector machine (QSVM) approach for the following key reasons:
It has been theoretically shown that QSVMs can run exponentially faster than classical algorithms on some computational problems.
SVM models have been studied in depth mathematically, so their accuracy, stability, convergence, and error bounds can be assessed precisely.
QSVMs rely on shallow quantum circuits suited to intermediate-scale quantum computing systems, which makes them more practical.
The aim of this research is to increase prediction accuracy using quantum kernel methods and to empirically assess their advantages over classical models.
Data cleanup and preparation
Duplicate records were identified by iteratively comparing combinations of different attributes and were removed from the dataset. Outlier values and the price and sales-volume measurements were normalized using the Min-Max normalization method according to formula (1).
$$x_{\text{new}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (1)$$
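As an illustration, the following minimal sketch applies such Min-Max scaling with scikit-learn; the file name and the exact set of scaled columns are assumptions, not details reported in the paper.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical input file; adjust to the actual data source.
df = pd.read_csv("sales.csv")
df = df.drop_duplicates()  # remove duplicate records

# Min-Max scaling per formula (1): x_new = (x - x_min) / (x_max - x_min)
scaler = MinMaxScaler()
df[["unit_price", "quantity_sold"]] = scaler.fit_transform(
    df[["unit_price", "quantity_sold"]]
)
```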
Missing values were processed depending on the attribute type. The dataset was also enriched by extracting calendar features from the date attribute: the day, month, year, and week number columns were derived from the sales date, which made it possible to account for seasonality and changes over time in the model. A minimal sketch of this step is given below.
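The sketch assumes the raw table contains a sale_date column (a hypothetical name; adjust to the actual schema):

```python
import pandas as pd

# Parse the sales date and derive the calendar features described above.
df["sale_date"] = pd.to_datetime(df["sale_date"])
df["day"] = df["sale_date"].dt.day
df["month"] = df["sale_date"].dt.month
df["year"] = df["sale_date"].dt.year
df["week"] = df["sale_date"].dt.isocalendar().week
```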
Based on these new columns, the feature set was refined into the main set, i.e., a structure covering the most important time- and category-related attributes. This approach increased forecasting accuracy by making each model sensitive to time components. Table 2 shows a fragment of the data structure formed as a result of these changes.
Table 2. Updated attribute structure based on sales data

| day | month | year | product_id | agent_id | category | week_name | holiday | unit_price | quantity_sold |
|------|--------|------|------------|----------|----------|-----------|---------|------------|---------------|
| 0.9666 | 0.5454 | 0.5 | 0.3454 | 0.7733 | 0 | 0.1667 | 0 | 0.0038 | 0.0039 |
| 0.9666 | 0.2727 | 1 | 0.8 | 1 | 0 | 0.3333 | 0 | 0.0884 | 0.0883 |
| 0.1333 | 0.0909 | 1 | 0.4363 | 0.6 | 0 | 0.3333 | 0 | 0.0001 | 0.0019 |
| 0.7 | 1 | 0.5 | 0.0909 | 0.5333 | 0 | 1 | 0 | 0.0182 | 0.0221 |
| 0.0666 | 0.7272 | 0.5 | 0.4363 | 0.5733 | 0 | 0.1666 | 0 | 0.0002 | 0.0080 |
| 0.6666 | 0 | 1 | 0.4363 | 0.6 | 0 | 0.1666 | 0 | 0.0001 | 0.0019 |
| 0.5666 | 0.5454 | 0.5 | 0.0909 | 0.5333 | 0 | 0.5 | 0 | 0.0023 | 0.0029 |
| 0.1667 | 0.6363 | 0.5 | 0.0909 | 0.6533 | 0 | 0.1666 | 0 | 0.0020 | 0.0015 |
| 0.7333 | 0 | 0.5 | 0.2909 | 0.5333 | 0 | 0.1666 | 0 | 0.0070 | 0.0059 |
| 0.5 | 0 | 1 | 0.5272 | 0.6 | 0 | 0.5 | 0 | 0.0017 | 0.0015 |
Autoencoder-based dimensionality reduction and supervised prediction
In this study, the autoencoder architecture was adapted not only for dimensionality reduction but also for predicting the target variable through supervised learning. Unlike traditional autoencoders, in this approach a regression module is placed at the output instead of a decoder layer. As a result, the latent representations produced by the encoder are passed directly to the prediction model (Fig. 4).
The general structure of the autoencoder model is as follows:
Encoder: projects the input attributes into a compressed, low-dimensional latent space;
Latent layer: holds the compressed, information-dense representation;
Regressor: a layer that projects data from the latent space onto the target variable.
The attributes used in the model form several combinations, each of which always contains the basic attributes "quantity_sold" and "unit_price." These attributes are defined as the main set, and the various combinations are formed by adding other parameters (for example, holiday, agent_id, week_name, category).
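As a purely illustrative sketch (the enumeration scheme itself is an assumption about how such sets could be formed, not a procedure described in the paper), the attribute combinations could be generated as:

```python
from itertools import combinations

base = ["quantity_sold", "unit_price"]            # main set, always included
optional = ["holiday", "agent_id", "week_name", "category"]

# Every combination = main set plus any subset of the optional attributes.
feature_sets = [
    base + list(extra)
    for r in range(len(optional) + 1)
    for extra in combinations(optional, r)
]
```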
When processing the attribute combinations, the maximum dimension of the latent space was set to 10. Accordingly, the following rules were applied:
If the number of input attributes is greater than 10, the autoencoder compresses them into a 10-dimensional latent space;
If the number of input attributes is 10 or less, the latent-space dimension is set equal to the input size.
With this approach, comparable and consistent compressed representations were created for each attribute combination. These latent-space representations were then passed as input data to the subsequent regression models.
Training of the autoencoder model was carried out with the following technical configuration (a minimal sketch of the model is given below):
Optimizer: Adam
Loss function: Mean Squared Error (MSE)
Encoder activation: ReLU
Regressor activation: Linear
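The following is a minimal Keras sketch of such an encoder-plus-regressor model under the configuration listed above; the intermediate layer width, epoch count, and batch size are illustrative assumptions, while the min(n_features, 10) rule for the latent dimension is taken from the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_encoder_regressor(n_features: int) -> tf.keras.Model:
    # Latent-size rule from the text: at most 10, otherwise equal to the input size.
    latent_dim = min(n_features, 10)

    inputs = layers.Input(shape=(n_features,))
    # Encoder: ReLU layers projecting into the low-dimensional latent space
    # (the intermediate width of 32 is an illustrative assumption).
    x = layers.Dense(32, activation="relu")(inputs)
    latent = layers.Dense(latent_dim, activation="relu", name="latent")(x)
    # Regression head with a linear activation, used instead of a decoder.
    output = layers.Dense(1, activation="linear")(latent)

    model = models.Model(inputs, output)
    model.compile(optimizer="adam", loss="mse")  # Adam optimizer, MSE loss
    return model

# X_train and y_train are assumed to be the prepared (normalized) feature matrix
# and target vector; epochs and batch size are illustrative.
model = build_encoder_regressor(n_features=X_train.shape[1])
model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=0)

# Extract the latent representations to be passed to the (Q)SVR models.
encoder = models.Model(model.input, model.get_layer("latent").output)
Z_train = encoder.predict(X_train)
```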
The regression problem based on the compressed latent values was solved with two different models (a sketch of the classical branch follows the list):
1. Quantum approach - implemented using the Quantum Variational Regressor (QVR) model developed on the Qiskit platform. The COBYLA algorithm was used to optimize the model parameters. The latent vectors obtained from the autoencoder were encoded and passed to the quantum circuit, and the regression results were obtained from this quantum architecture.
2. Classical approach - a Support Vector Regression (SVR) model was built on the compressed data. This model effectively predicts over the high-dimensional input space through kernel projections.
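For the classical branch, a minimal sketch of fitting an SVR with an RBF kernel on the latent vectors might look as follows; the hyperparameters are illustrative assumptions rather than values reported in the paper, and the quantum branch was built analogously on the Qiskit platform.

```python
from sklearn.svm import SVR

# Z_train, Z_test are the latent vectors produced by the encoder above;
# y_train is the target variable (e.g. the normalized unit price).
svr = SVR(kernel="rbf", C=1.0, epsilon=0.01)  # hyperparameters are illustrative
svr.fit(Z_train, y_train)
y_pred = svr.predict(Z_test)
```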
Thanks to this approach, the autoencoder served not only as a means of dimensionality reduction but also improved the overall performance of the quantum and classical prediction models by creating semantically meaningful latent representations.
Figure 4. A typical autoencoder consists of two deep neural networks, each composed of several dense layers.
Performance Metrics
RMSE is one of the most widely used indicators for assessing the accuracy of regression models. It measures the deviation of the predicted values of the target variable from the actual values in the dataset. The RMSE equation has the following form:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2}$$
In this formula, $n$ is the number of observations in the dataset, $Y_i$ is the actual value of the target variable for the $i$-th row, and $\hat{Y}_i$ is the corresponding predicted value of the target variable.
The MAE metric is a convenient indicator for assessing the accuracy of machine learning algorithms. It measures the absolute difference between the predicted values and the actual values of the target variable in the dataset. MAE is expressed by the following formula:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_i - \hat{Y}_i\right|$$
MAE is obtained by taking the absolute value of the difference between the predicted and actual values and averaging these differences. It therefore reflects the average magnitude of the errors, whether large or small, without considering their direction.
The R² metric, also known as the coefficient of determination, is an effective indicator of the performance of regression models. It measures the stability of the algorithm by comparing the variability of the predicted values with the variability of the actual values of the target variable. The formula for the R² indicator is as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2}{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}$$
This article analyzes the effectiveness of support vector regression models (SVR and QSVR) built with classical and quantum approaches. All experiments were conducted in Python using the IBM Qiskit quantum computing libraries. To ensure the reliability of the results, the dataset was split according to a hold-out validation scheme into 80% training and 20% test sets.
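A minimal sketch of this hold-out evaluation with scikit-learn, computing the metrics defined above, could look as follows; the variable names Z, y, and model are placeholders for the latent feature matrix, the target vector, and a fitted regressor from the earlier sketches.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# 80/20 hold-out split of the latent features and target (placeholders).
Z_train, Z_test, y_train, y_test = train_test_split(Z, y, test_size=0.2, random_state=42)

model.fit(Z_train, y_train)          # e.g. the SVR model from the previous sketch
y_pred = model.predict(Z_test)

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)                  # RMSE as defined by the formula above
r2 = r2_score(y_test, y_pred)
print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```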
Table 3 presents the main evaluation metrics for the SVR and QSVR models: mean absolute error (MAE), mean squared error (MSE), and the coefficient of determination (R²). The results show that the QSVR model outperformed the classical SVR on all key metrics.
Table 3. Prediction results for the SVR and QSVR models (on the test set).

| Model | MAE | MSE | R² |
|-------|-------|-------|-------|
| SVR | 0.120 | 0.410 | 0.674 |
| QSVR | 0.077 | 0.069 | 0.780 |
As the table shows, the QSVR model has lower MAE and MSE values and a higher R² value, i.e., the quantum model predicts the data more accurately than the classical model.
CONCLUSION
This study evaluated the effectiveness of autoencoder-based dimensionality reduction combined with classical and quantum support vector regression models for predicting sales delays on retail data. Regression models built on the latent representations of the autoencoder showed higher accuracy than traditional approaches. The experimental results confirmed that the QSVR model achieved lower MAE and RMSE values and a higher R² value than SVR. Moreover, the use of compressed latent representations significantly improved the overall predictive performance of the models. These results show that the Autoencoder → (Q)SVR pipeline has practical value for detecting sales delays and predicting possible future disruptions.
REFERENCES
1. Zong, C., et al. (2017). Residual value prediction of articulated trucks using regression models. Journal of Transportation Engineering.
2. Chiteri, A. (2018). Auction-based valuation of used trucks using resale data. Applied Economics.
3. Milošević, D. (2021). Ensemble regression models for price prediction of construction equipment. International Journal of Forecasting.
4. Shehadeh, H., & Alshboul, M. (2021). Predicting residual value of construction machinery using regression analysis. Journal of Construction Engineering and Management.
5. Stühler, T., et al. (2023). Comparative analysis of AutoML and ML models for online retail equipment sales forecasting. Machine Learning Applications.
6. Schuld, M., Sinayskiy, I., & Petruccione, F. (2015). An introduction to quantum machine learning. Contemporary Physics.
7. Havlíček, V., et al. (2019). Supervised learning with quantum-enhanced feature spaces. Nature.
8. Biamonte, J., et al. (2017). Quantum machine learning. Nature.
9. Goodfellow, I., et al. (2016). Deep Learning. MIT Press.
10. Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer.
