EP4369679B1 - Data analytics on measurement data - Google Patents

Data analytics on measurement data

Info

Publication number
EP4369679B1
Authority
EP
European Patent Office
Prior art keywords
values
value
time series
numerical value
measurement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
EP22206704.3A
Other languages
German (de)
French (fr)
Other versions
EP4369679A1 (en)
Inventor
Lajos Bajzik
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Solutions and Networks Oy
Original Assignee
Nokia Solutions and Networks Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Solutions and Networks Oy
Priority to EP22206704.3A (EP4369679B1)
Priority to US18/483,213 (US12335127B2)
Publication of EP4369679A1
Application granted
Publication of EP4369679B1
Active (current legal status)
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H04L43/106 Active monitoring, e.g. heartbeat, ping or trace-route using time related information in packets, e.g. by adding timestamps
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H04L43/067 Generation of reports using time frame reporting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring

Definitions

  • Various example embodiments relate generally to a method and apparatus for performing data analytics on measurement data.
  • Data analytics may be performed on time series of measurement data, such as for example multi-variate performance management (PM) data time series.
  • the input of such data analytics is historically collected measurement data that is available for a specific time period and one or more measured network entities (e.g. RAN, Radio Access Network, cells) in a communication network.
  • Measurement data may contain a separate time series of measured values per network entity and/or per measurement parameter.
  • a time series of values may include values of several measurement parameters for one or more network entities.
  • the network entities may have heterogeneous configuration. For example, in a RAN case, if the measurement data covers a large number of RAN cells, then it is likely that some cells will have certain radio features and / or functionalities enabled in their configuration, while others not.
  • the measurement data per network entity will most likely contain some measurements which are specific to a certain functionality. These make sense for a specific network entity only in periods of time when the given functionality is enabled in the network entity's configuration.
  • CA Carrier Aggregation
  • These are referred to herein as conditional measurements, as the measurement data make sense, i.e. are available at a given timestamp, only on the condition that the network entity's configuration at that timestamp supports the measurement such that a measured value is available for the measurement parameter at the given timestamp.
  • the time periods during which no measured value is available are referred to herein as the "unsupported (measurement) periods" and the measurements performed during these "unsupported periods" are referred to herein as the "unsupported measurements".
  • the collected measurement data must contain all measured values for all timestamps and one or more network entities, even for unsupported periods during which no measured value is available because the network entity's configuration does not support the given conditional measurement. Also, the data repository of the network operator is configured to store measured values even during the unsupported periods.
  • the first (explicit) way is to store a specific value (e.g. NULL value) during the unsupported periods: this specific value is not a valid measured value for the measurement parameter and can be distinguished from any other measured value, but this specific value is an explicit indication that the measurement was not supported at the timestamp.
  • the second (inaccurate) way is to replace the measured value with a specific numerical value that is a valid value for the measurement parameter, but this specific numerical value cannot be distinguished from a "true" measured value obtained outside an unsupported period.
  • This specific numerical value is referred to herein as a "special value" or "special numerical value" for a conditional measurement.
  • Such a special value can be a value which is in the range of valid measured values (e.g. value 0 for the CA throughput), or can be a value that is a valid value for the measurement parameter, but not in the range of valid measured values (e.g. -1 for CA throughput).
  • One or more example embodiments describe methods for processing time series of measured values obtained for a measurement parameter for respective timestamps and for each of one or more network entities.
  • the method is scalable to any measurement data set.
  • the method is designed to be scalable to any number of network entities, any number of measurement parameters and any number of measured values in the measurement time period per network entity.
  • a data provider 150, 160, 170 may be any network device or network function that is configured to generate data (e.g. measurement data) and to provide the generated data to at least one data consumer 190.
  • the data provider #1 150 is configured to generate several time series 151, 152, 153 of measured values for a measurement parameter (e.g. a cell throughput) concerning a first measured network object (e.g. a first radio cell).
  • the data provider #2 160 is configured to generate several time series 161, 162, 163 of measured values for the same measurement parameter concerning a second measured network object (e.g. a second radio cell).
  • the data provider #3 170 is configured to generate several time series 171, 172, 173 of measured values for the same measurement parameter concerning a third measured network object (e.g. a third radio cell).
  • the data consumer 190 may be any network device or network function that is configured to collect data (e.g. measurement data) from one or more data providers 150, 160, 170.
  • the data consumer 190 may be configured to store the collected data in a database 180.
  • the data consumer 190 may be configured to perform data analytics on the collected data and generate data analytics results 195.
  • a time series of values for a network entity and a measurement parameter includes values of the measurement parameter that may be obtained for respective timestamps (e.g. evenly spaced timestamps corresponding to time steps) inside a measurement time period (e.g. historical time period). There may be one timestamp for each measurement interval inside the measurement time period.
  • the measurement parameter may concern a network entity, also referred to herein as the "measured object" or "measured entity" or "measured network entity".
  • a measured network entity may correspond to various entities: a physical device (e.g. a base station, a user equipment, a router, a gateway, a controller, etc) in a communication network, a communication medium in a communication network (e.g. a radio channel or radio subchannel, a frequency band, etc), a radio cell in a communication network, a functionality in a communication network, etc.
  • the number of distinct measured objects in the time series of values may be high, for example tens of thousands in a RAN cell case.
  • the measurement interval between two values used in typical cases may range for example from one hour to five minutes, while the total historic time period may range for example from several months to one day or one hour.
  • FIG. 2 shows a simplified example of a time series 200 of values, where each value is represented by a box.
  • the different measured values are marked with different patterns at the measurement timestamps.
  • This example time series 200 includes 20 values for corresponding timestamps. Each value may be equal to v1, v2, v3, v4, v5 or v6 as represented by FIG. 2.
  • the time series may include sequences of values at consecutive timestamps (e.g. corresponding to time steps) during which the measured value remains the same. These sequences are referred to herein as same-value sequences.
  • Each same-value sequence has a length in number of timestamps, which can be 1 or larger, and a value, which corresponds to the unchanged measured value during the sequence.
  • the value v1 is repeated 7 times and therefore the length of this same-value sequence of value v1 is equal to 7.
  • a sliding time window 210 may be applied to the time series of values 200 to analyze the values within the sliding time window 210, for example to detect a number of changes of values within the sliding time window 210.
  • the sliding time window 210 has a length of 7 (it includes 7 values) and at the position represented in the figure, 4 changes of values occur within the sliding time window 210.
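The two notions introduced with FIG. 2, same-value sequences and the count of value changes inside a sliding time window, can be sketched as follows. This is a minimal illustration only: the function names and the 20-value series are hypothetical (the actual values of the series 200 are not given in the text), chosen so that the first same-value sequence has length 7 and one window position of length 7 contains 4 changes.

```python
from itertools import groupby

def same_value_sequences(series):
    """Return (value, length) for each run of identical consecutive values."""
    return [(value, len(list(run))) for value, run in groupby(series)]

def changes_in_window(series, start, window_len):
    """Count value changes between consecutive timestamps inside one window."""
    window = series[start:start + window_len]
    return sum(1 for prev, cur in zip(window, window[1:]) if prev != cur)

# Hypothetical 20-value series (an assumption, not the series of FIG. 2).
series = ["v1"] * 7 + ["v2", "v3", "v3", "v4", "v2", "v5", "v5",
                       "v6", "v6", "v6", "v2", "v3", "v4"]

runs = same_value_sequences(series)
print(runs[0])                           # first same-value sequence
print(changes_in_window(series, 7, 7))   # changes for one window position
```

A window of length 7 covers 7 values and therefore at most 6 value changes, which is worth keeping in mind when a change-count threshold is chosen.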
  • FIG. 3 shows a flowchart of a method for processing time series of measured values according to an example.
  • the steps of the method may be implemented by an apparatus configured to implement a data consumer according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • In step 300, a time series of values of a measurement parameter for respective timestamps is obtained, for each of one or more network entities.
  • Each time series of values may include measured values and a special numerical value at one or more timestamps.
  • the special numerical value is used in the time series at a given timestamp for replacing a value of the measurement parameter when no measured value is available for the measurement parameter at the given timestamp.
  • In step 310, the time series of values are parsed to determine which numerical value in the time series of values corresponds to the special numerical value.
  • the parsing may be based on the verification of one or more conditions.
  • the one or more conditions may include at least a first condition #1 and a second condition #2.
  • the first condition #1 to be verified during the parsing may be based on the detection of same-values sequences having a minimum length L in the time series of values.
  • the first condition #1 may be verified if the set S of values for which same-values sequences having the minimum length L are detected includes only one value. In this case, the sole value v0 in the set of values is identified as being the special value v0.
  • the second condition #2 to be verified during the parsing may be based on a count of value changes occurring in a sliding time window of a given length W_ch applied to the time series of values.
  • the second condition #2 may be verified if the count of value changes occurring in a sliding time window is above a threshold N_ch for at least one temporal position of the sliding time window. This means that there exists at least one time window of length W_ch in the time series obtained for the network entities in which the measurement changes value frequently enough, at least N_ch times.
  • a third condition #3 may be verified during the parsing. By using the three conditions #1, #2, #3 together, one can make it very likely that the detected measurements are conditional measurements that use a special value when a measured value is not available.
  • the third condition #3 may be based on a ratio q(v0) = N(v0) / N(v), where N(v0) is the number of values, in the one or more time series obtained respectively for the one or more network entities, that are equal to the special value v0 found based on the first condition, and N(v) is the total number of measured values in these time series.
  • In step 320, based on the result of the parsing step 310, flags are assigned to the values in the time series of values.
  • a flag assigned to a value obtained at a given timestamp indicates whether or not the measurement was available at the given timestamp in the time series of values.
  • the method may comprise: determining whether the special value is a value out of a normal range of values in which the measured values fall or in the normal range of values.
  • This determination may be based on a comparison between a first count C0 of same-values sequences with the special value in time series of values obtained for the one or more network entities and a second count CS of same-values sequences with the special value in time series of values obtained for one or more network entities that are shorter than a threshold L.
  • a flag assigned to a measured value is equal to a first flag value if the concerned measured value is equal to the special value and a second flag value otherwise.
  • the method may comprise: using a statistical distribution of the lengths of same-values sequences with the special value that are shorter than a threshold to detect that a same-value sequence with the special value has a length that is an outlier in the statistical distribution.
  • a flag corresponding to a given timestamp takes a first flag value (e.g. the first flag value is 1) for each value in the time series of values that is equal to the special value when the length of the same-values sequence including the concerned value at the given timestamp is an outlier in the statistical distribution and a second flag value (e.g. the second flag value is 0) otherwise.
  • Using the statistical distribution to detect an outlier may be performed using a classification algorithm to detect that the length value is an outlier in the statistical distribution of length values.
  • In step 330, data analytics may be performed on the time series of values based on the flags to generate data analytics results.
  • Data analytics tasks may be categorized into two broad classes based on the length of the analyzed time period.
  • there are offline or batch data analytics tasks when the data analytics is done for data collected for a long historical time period of one or several months, typically with measurement interval of 1 hour.
  • In the other class, the historical time period is one or a few days long, and the measurement interval is usually below 1 hour.
  • one or more operations may be performed on one or more network devices and / or network functions based on the data analytics results.
  • the operation may depend on the context and / or a scenario and / or network environment and / or the type of measurement parameter being monitored.
  • the one or more operations may include at least one of a configuration operation, a resource management operation, a monitoring operation, a channel estimation, an optimization operation, a repair operation, a maintenance operation, a restart, a reboot, a software update, a signaling operation, etc.
  • FIG. 4A shows a flowchart of a method for performing data analytics according to an example.
  • the steps of the method may be implemented by an apparatus according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • Time series 451, 452, 453 of measured values are stored within a database and provided as input to the method.
  • the time series 451, 452, 453 include measured values per measurement parameter and per each network entity for a given analyzed time period.
  • In step 410, the time series 451, 452, 453 are parsed to detect special values in the measured values. This parsing may be performed as disclosed for example by reference to FIG. 3 (step 310) and / or FIG. 5.
  • a table 415 of unsupported measurements may be generated for the measured values that are equal to the detected special value.
  • the table 415 may include one row per measured value.
  • One row may include the name of the measurement parameter, the associated timestamp of the measured value and the detected special value.
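A table of unsupported measurements with the row layout described above could be built as in the sketch below. The function name, the dictionary keys, the parameter name `ca_throughput` and the sample timestamps and values are all illustrative assumptions; the text only states that a row holds the measurement parameter name, the timestamp and the detected special value.

```python
def build_unsupported_table(parameter, timestamps, values, special_value):
    """Return one row per measured value equal to the detected special value."""
    return [
        {"parameter": parameter, "timestamp": ts, "special_value": special_value}
        for ts, value in zip(timestamps, values)
        if value == special_value
    ]

# Hypothetical hourly samples; -1 plays the role of the special value.
timestamps = ["2023-01-01T00:00", "2023-01-01T01:00",
              "2023-01-01T02:00", "2023-01-01T03:00"]
values = [-1, -1, 12.5, 14.0]

table_415 = build_unsupported_table("ca_throughput", timestamps, values, -1)
for row in table_415:
    print(row)
```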
  • In step 420, flags are assigned respectively to the values in the time series 451, 452, 453 (one flag per value). This assignment may be performed as disclosed for example by reference to FIG. 3 (step 320) and / or FIG. 6.
  • In step 420, series of flags 425 corresponding respectively to values in the time series of values are generated (one flag per measured value).
  • a flag may be a binary value.
  • the flag may be equal to a first flag value (e.g. 1) if the measurement is an unsupported measurement and is equal to a second flag value (e.g. 0) otherwise.
  • When set to the first flag value, a flag indicates that at the given timestamp no measurement was available (e.g. the functionality required for the measurement was most likely not enabled in the given network entity's configuration).
  • the flag value for the same measurement in the per-timestamp profile vectors of a network entity can be 1 in certain sub-periods of the whole historical data time period and 0 in others.
  • When the flag value is set to 1, the measured value for the network entity is not to be interpreted as a real measurement.
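For the simple case in which every occurrence of the special value marks an unsupported measurement, the series of binary flags can be sketched as below. The function name and the sample values are assumptions made for illustration, with -1 standing in for the special value of a CA throughput measurement.

```python
def assign_flags(values, special_value):
    """Flag 1 marks an unsupported measurement, flag 0 a real measured value."""
    return [1 if value == special_value else 0 for value in values]

# Hypothetical CA throughput samples; the feature is assumed to be
# disabled during the first three timestamps.
ca_throughput = [-1, -1, -1, 12.5, 14.0, 13.2]
flags = assign_flags(ca_throughput, special_value=-1)
print(flags)
```

This direct mapping is only adequate when the special value can never be produced by a normal measurement; the statistical approach described with FIG. 6 addresses the other case.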
  • In step 430A, data analytics are performed on the time series 451, 452, 453, independently of the knowledge of the presence of special values in these time series.
  • In step 430A, data analytics results 435A are generated.
  • the data analytics (step 430A) is executed for the whole time series and the steps 410 and 420 may be executed in parallel with the data analytics (step 430A).
  • In step 440A, the data analytics results 435A are analyzed together with the series of flags generated in step 420.
  • the analysis may include interpretation and / or Root Cause Analysis (RCA) of the data analytics results 435A by a human expert.
  • the analysis may include any other analysis task performed by a human expert and / or by an analysis software.
  • the human expert may interpret the data analytics results 435A by using the table 415 of unsupported measurements and/or the series of flags 425 assigned to the measured values.
  • FIG. 4B shows a flowchart of a method for performing data analytics according to an example.
  • the steps of the method may be implemented by an apparatus according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • the method of FIG. 4B is a variant of the method of FIG. 4A and the steps 410, 420 are the same in both methods.
  • In step 430B, data analytics are performed on the time series 451, 452, 453, based on the series of flags, i.e. with the knowledge of the presence of special values in these time series.
  • data analytics results 435B are generated.
  • the time series of values may be split into partitions, such that the per network entity per timestamp measured values in a partition have the same flag values assigned to them. Then the data analytics is executed independently per each partition.
  • In step 440B, the data analytics results 435B are analyzed.
  • the analysis may include interpretation and / or Root Cause Analysis (RCA) of the data analytics results 435B by a human expert.
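The partitioning idea can be sketched as follows, under the assumption that each sample carries a single flag and that samples with equal flag values form one partition. The tuple layout, the function names and the use of a plain mean as a stand-in for the actual data analytics are all illustrative choices, not taken from the text.

```python
from collections import defaultdict
from statistics import mean

def partition_by_flag(samples):
    """Group (entity, timestamp, value, flag) samples by their flag value."""
    partitions = defaultdict(list)
    for entity, timestamp, value, flag in samples:
        partitions[flag].append((entity, timestamp, value))
    return dict(partitions)

def mean_per_partition(partitions):
    """Placeholder analytics: average the values of each partition separately."""
    return {flag: mean(value for _, _, value in rows)
            for flag, rows in partitions.items()}

# Hypothetical samples; -1 is the special value, flagged with 1.
samples = [
    ("cell_a", 0, -1, 1), ("cell_a", 1, 10.0, 0),
    ("cell_b", 0, 20.0, 0), ("cell_b", 1, -1, 1),
]
partitions = partition_by_flag(samples)
print(mean_per_partition(partitions))
```

Running the analytics per partition keeps the special value out of the statistics computed for real measurements, whereas a mean over all four samples would be pulled down by the two -1 entries.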
  • the analysis may include any other analysis task performed by a human expert and / or by an analysis software.
  • the human expert may interpret the per-partition analytics result using the table 415 of unsupported measurements and/or the series of flags 425 assigned to the measured values.
  • the method helps avoid misinterpretation of data analytics results. For example, consider data analytics that implements an unsupervised classification that automatically classifies the per-timestamp, per-network-entity measured values into a small number of classes. These classes can be seen as the learned set of possible entity states in which the network entities can be at a given time. In this example, we assume that some of the network entities were configured in certain sub-periods such that a given measurement's prerequisite was not enabled for them, so the measurements were filled with the special numerical value for these sub-periods. If the special numerical values are not detected in the measured data as disclosed herein, and the unsupervised classification assigns all the measured values equal to the special numerical value to a separate class, this class is effectively learned as a separate entity state. This entity state can easily be misinterpreted as some kind of performance issue, while it is in fact just a configuration state. Alternatively, if the special value is an unusual choice, the PM statistics/symptoms of the entity state can be hard for the expert to understand.
  • FIG. 5 shows a flowchart of a method for detecting a special numerical value according to an example.
  • the steps of the method may be implemented by an apparatus according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • the method is described for the processing applied, for one specific measurement parameter, to the time series of measured values obtained for this measurement parameter. It may be applied to several parameters by performing the steps independently for each measurement parameter.
  • In step 500, the variables S and N are initialized.
  • S is the set of measurement values occurring in same-value sequences of length at least equal to L. S is initialized to the empty set.
  • N is the number of sliding windows of length W_ch in which the number of value changes is at least N_ch. N is initialized to zero.
  • Step 510 is performed for each network entity for which time series of measured values are obtained. Step 510 includes steps 511 and 512.
  • In step 511, the set S is updated by adding to the set S the measured value of each same-value sequence in the time series whose length is at least L, provided that this measured value is not yet in the set S.
  • In step 512, the value of N is updated by incrementing N by the number of sliding windows of length W_ch over the time series in which the number of changes is at least N_ch for a given network entity.
  • Step 520 is performed when the steps 511 and 512 have been performed for all network entities for which the time series of measured values are obtained.
  • In step 520, the first condition #1 is tested.
  • the first condition #1 is verified if the set S includes only one value, noted v0. If the first condition #1 is verified, step 530 is executed after step 520. Otherwise the method ends.
  • This condition #1 is based on the natural assumption that if a given measurement is not available for a network entity due to its configuration, then this configuration state lasts for a longer time period, so the time periods with unchanged special value are most likely long.
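The collection of the set S (step 511) and the test of the first condition #1 can be sketched as below. The function names, the dictionary keyed by entity name and the sample series are assumptions for illustration; `min_len` plays the role of the threshold L.

```python
from itertools import groupby

def long_run_values(series, min_len):
    """Values of the same-value sequences whose length is at least min_len."""
    return {value for value, run in groupby(series)
            if len(list(run)) >= min_len}

def candidate_special_value(series_by_entity, min_len):
    """Return the sole element of S if condition #1 holds, otherwise None."""
    s = set()
    for series in series_by_entity.values():
        s |= long_run_values(series, min_len)   # set union skips duplicates
    return next(iter(s)) if len(s) == 1 else None

# Hypothetical per-cell series; -1 is held for a long period in cell_a.
series_by_entity = {
    "cell_a": [-1, -1, -1, -1, -1, 3.2, 4.1, 3.9],
    "cell_b": [2.0, 2.5, 2.5, 3.0, 2.8, 3.1, 2.2, 2.9],
}
v0 = candidate_special_value(series_by_entity, min_len=4)
print(v0)
```

If a second value also appears in a long same-value sequence, S has more than one element and the sketch returns `None`, mirroring the case in which the method ends after the test of condition #1.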
  • In step 540, the second condition #2 is tested.
  • the second condition #2 is verified if N > 0. If the second condition #2 is verified, step 550 is executed after step 540. Otherwise the method determines that the special value v0 found at step 530 is not used for unsupported measurements and the method ends.
  • condition #1 collects evidence that one or more network entities have long periods when the measurement is not available (represented with the same special value in their time series).
  • condition #2 collects evidence that there are one or more periods for some network entity or network entities when the measurement is available and measured normally, as indicated by frequent value changes. Checking this condition #2 is done because unsupported measurement means not only that the measurement is not available for network entities and periods when their configuration lacks the required feature, but also that it is available for other network entities and/or other periods.
  • the adjustment of the values of W_ch and N_ch may be performed in different manners. However, if the value of L is selected to span one day as discussed above, then W_ch could be equal to L and N_ch be selected such that N_ch > W_ch / L_min, where L_min is a period length selected such that the configuration of a network entity remains the same during L_min with high probability. This selection of N_ch prevents value changes between the special value and normal measured values, which may happen due to configuration changes, from being misinterpreted as normal changes between normal measured values.
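The window count N used by the second condition #2 and the selection rule for the change threshold can be sketched as follows. The function names and the numbers are hypothetical: an hourly measurement interval with a window of 24 values and a minimum configuration period of 6 values is assumed, and a window is counted when it contains at least the threshold number of changes, as in the description of the variable N.

```python
def count_busy_windows(series, window_len, min_changes):
    """Number of window positions containing at least min_changes value changes."""
    changed = [int(a != b) for a, b in zip(series, series[1:])]
    count = 0
    for start in range(len(series) - window_len + 1):
        # A window of window_len values spans window_len - 1 adjacent pairs.
        if sum(changed[start:start + window_len - 1]) >= min_changes:
            count += 1
    return count

def smallest_n_ch(w_ch, l_min):
    """Smallest integer threshold strictly greater than w_ch / l_min."""
    return w_ch // l_min + 1

# Hypothetical sizing: one-day window of hourly values, 6-hour minimum period.
n_ch = smallest_n_ch(w_ch=24, l_min=6)
print(n_ch)

# Hypothetical series: six special values followed by six changing values.
series = [-1] * 6 + [1, 2, 3, 4, 5, 6]
n = count_busy_windows(series, window_len=6, min_changes=5)
print(n > 0)   # condition #2 holds when at least one window qualifies
```

With these numbers, at most 24 / 6 = 4 changes per window could be explained by configuration changes alone, so a threshold of 5 requires at least one change between normal measured values.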
  • In step 550, q(v0) is computed over all measured entities.
  • q(v0) is the relative frequency of the candidate special value v0 over all measurements.
  • q(v0) gives what fraction of all the measurements in the time series of all network entities takes the candidate special value v0.
  • In step 560, the third condition #3 is tested.
  • the third condition #3 is verified if q(v0) ≤ Q_th. If the third condition #3 is verified, step 570 is executed after step 560. Otherwise it is determined that the measurements are not conditional measurements and the method ends.
  • the third condition #3 is optional, but it improves the method in specific cases because there are often measurement parameters that take the same value most of the time, even when they are available and measured normally. Examples are counters of very rare error events, which take the value 0 most of the time. Based on the first two conditions alone, these types of measurements can very easily be mistaken for unsupported measurements with the special value being their usual value (0 in the error counter example), even when they are available and measured normally for all network entities and over the whole measurement period. As a trade-off, to avoid false positives, these situations are detected by using this third condition #3.
  • the value of Q_th can be selected to some value slightly lower than 1, for example between 0.9 and 1 or between 0.99 and 1.
  • In step 570, it is determined that the measurements are conditional measurements using the special value v0 when a measured value is not available.
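The ratio q(v0) and the test of the third condition #3 can be sketched as below. The function names, the default threshold of 0.95 (inside the suggested range between 0.9 and 1) and the sample data are assumptions; the comparison is taken here as "q(v0) does not exceed the threshold", which is itself an assumption about the exact operator.

```python
def special_value_ratio(series_by_entity, v0):
    """Fraction of all values, over all entities, that are equal to v0."""
    total = sum(len(series) for series in series_by_entity.values())
    hits = sum(1 for series in series_by_entity.values()
               for value in series if value == v0)
    return hits / total if total else 0.0

def condition_3(series_by_entity, v0, q_th=0.95):
    """Assumed reading: verified when v0 is not the value almost all the time."""
    return special_value_ratio(series_by_entity, v0) <= q_th

# Hypothetical rare-error counter: 0 nearly always, measured normally.
error_counter = {"cell_a": [0] * 99 + [1]}
# Hypothetical conditional measurement: -1 during half of the period.
ca_throughput = {"cell_a": [-1] * 5 + [3.0, 4.0, 5.0, 6.0, 7.0]}

print(condition_3(error_counter, 0))     # counter is rejected as a false positive
print(condition_3(ca_throughput, -1))    # candidate special value is kept
```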
  • the method is biased toward increasing the reliability that a measurement detected as an unsupported measurement by the method is indeed an unsupported measurement, at the expense of potentially missing the detection of some measurements that are unsupported measurements in reality.
  • A conditional measurement can fall into two categories with respect to its special value:
  • the special value is a valid value of the measurement parameter, e.g. a valid floating point or integer value.
  • the determination whether the measured value was available at a given timestamp for a given network entity is straightforward: if the measured value is equal to the special value, then the measured value was not available (e.g. the corresponding flag value may be equal to 1), otherwise it was available (e.g. the flag value may be equal to 0).
  • a measured value equal to the special value does not necessarily mean that the measured value was not available at the timestamp.
  • the flag value can still be either 0 or 1, while at timestamps where it is not equal to the special value the profile vector element is surely 0.
  • FIG. 6 shows a flowchart of a method for assigning flags to measured values according to an example.
  • the steps of the method may be implemented by an apparatus according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • the same threshold L applied to lengths of same-values sequences is used as for the special value detection step (see FIG. 5 and the corresponding description).
  • the method evaluates a heuristic condition (see step 610) for inferring which of the two categories described above the time series of measured values belongs to.
  • the heuristic condition is based on the observation that in case of unsupported measurement that falls in category a) (for which the special value is not a normal measured value) there are two possible cases when a same-value sequence with the special value is shorter than the L parameter (which has been used for the special value detection step):
  • In step 600, the values of the variables C0 and CS are determined.
  • C0 is the total count of same-value sequences with the special value v0 over all time series of measured values obtained for all network entities during a time period.
  • CS counts the same-value sequences with the special value v0, over all time series of measured values obtained for all network entities, that are shorter than L and correspond to an unsupported measurement period that is contained entirely in the whole measurement period.
  • These same-value sequences correspond to unsupported measurement period(s) that is (are) contained entirely in the whole measurement period, but has (have) a length shorter than the L parameter, as defined in case #2) above.
  • the same-value sequences with the special value v0, if any, that start right at the first measurement time step may then be excluded for the determination of CS.
  • the same-value sequences with the special value v0, if any, that end right at the last measurement time step may be excluded for the determination of CS.
  • in step 610, it is assumed that, for an unsupported measurement in category a) and a reasonably good selection of L, the relative fraction CS/C0 is low. Hence, if CS/C0 < Pth, it is determined that the unsupported measurement belongs to category a) (see step 620); otherwise the unsupported measurement belongs to category b) (see steps 630-650).
  • Pth should be set to a sufficiently small value, for example between 0 and 0.1 or between 0 and 0.01.
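  • The category test of steps 600-610 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names, the list-of-lists input format and the returned labels "a" / "b" are hypothetical, and sequences touching the first or last time step are excluded from CS as suggested above.

```python
from itertools import groupby


def special_value_runs(series, v0):
    """Return (start_index, length) for each same-value sequence equal to v0."""
    runs, pos = [], 0
    for value, group in groupby(series):
        length = len(list(group))
        if value == v0:
            runs.append((pos, length))
        pos += length
    return runs


def classify_special_value(all_series, v0, L, p_th):
    """Return "a" if CS/C0 < p_th, otherwise "b" (labels are illustrative)."""
    c0 = 0  # all same-value sequences with the special value
    cs = 0  # short sequences lying strictly inside the measurement period
    for series in all_series:
        n = len(series)
        for start, length in special_value_runs(series, v0):
            c0 += 1
            touches_edge = start == 0 or start + length == n
            if length < L and not touches_edge:
                cs += 1
    if c0 == 0:
        raise ValueError("no same-value sequence with the special value found")
    return "a" if cs / c0 < p_th else "b"
```

  • With a value such as -1 that only appears in long blocks, CS stays at 0 and the sketch returns "a"; with a value such as 0 that also occurs in many short interior sequences, the ratio exceeds Pth and the sketch returns "b".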
  • in step 620, the flags are assigned to the measured values in the time series obtained for the network entities.
  • in steps 630-650, a statistical approach is used, based on the statistical distribution of the lengths of same-value sequences with the special value.
  • the length of a same-value sequence with the special value may be used to decide whether the same-value sequence is a sequence (referred to as a "not normal" sequence) for which the measurement was not available, or a sequence (referred to as a "normal" sequence) in which the special value was the result of normal measurements taking the special value.
  • in step 630, a statistical distribution of the lengths of same-value sequences of the special value v0 that are shorter than L (the same-value sequences taken into account for the computation of CS) is generated.
  • the statistical distribution may be determined on the same-value sequences whose length is shorter than L, assuming that most of these sequences are "normal" sequences.
  • in step 640, the statistical distribution is used to determine whether the length of a given same-value sequence with the special value v0 is an outlier in the statistical distribution.
  • outlier means that the length is outstandingly long: the outlier may thus be seen as a high outlier.
  • the method may be configured to estimate the statistical distribution of the lengths of the "normal” sequences and then detect "not normal” sequences whose length is an outlier according to this statistical distribution.
  • the detection of the "normal" sequence length statistical distribution and/or the outlier detection may be performed using various algorithms, e.g. a classification algorithm. It can be based on the basic parameters of the statistical distribution (e.g. mean and/or standard deviation), which are then used to detect the outlier values in the statistical distribution, for example by detecting the length values that exceed a threshold computed based on the mean and standard deviation. It can also be based on machine learning algorithms, for example on a classification method such as a random forest.
  • in step 650, the flags are assigned to the measured values in the time series obtained for the network entities.
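  • One of the simpler options mentioned above, a threshold derived from the mean and standard deviation, can be sketched as follows. The factor k, the function names and the choice to treat every length of at least L as "not normal" are assumptions made for illustration only; the description leaves the exact algorithm open.

```python
from statistics import mean, pstdev


def high_outlier_threshold(short_lengths, k=3.0):
    """Threshold above which a length counts as a high outlier (mean + k * std)."""
    return mean(short_lengths) + k * pstdev(short_lengths)


def split_sequences(lengths, L, k=3.0):
    """Split sequence lengths into ("normal", "not normal") lists.

    The distribution is estimated only on lengths shorter than L, assuming
    that most of those sequences are "normal".
    """
    short = [n for n in lengths if n < L]
    threshold = high_outlier_threshold(short, k) if short else L
    normal = [n for n in lengths if n < L and n <= threshold]
    not_normal = [n for n in lengths if n >= L or n > threshold]
    return normal, not_normal, threshold
```

  • A mean-plus-k-sigma rule is sensitive to the outliers it is trying to find, so a robust estimator or a trained classifier may be preferable on real data.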
  • FIG. 7 shows a statistical distribution of the lengths of same-value sequences with the special numerical value v0 according to an example.
  • the x axis is the sequence length in number of measurement timestamps.
  • the y axis is the same-value sequence count.
  • the length values in the set 720 are outliers in the statistical distribution.
  • one can detect the outliers by taking the sequence lengths which are below L = 24: this does not introduce much error, because the number of "outlier" sequence length values below L is proportionally very small compared to the total number of sequence length values above L.
  • a process may be terminated when its operations are completed but may also have additional steps not disclosed in the figure or description.
  • a process may correspond to a method, function, procedure, subroutine, subprogram, etc.
  • when a process corresponds to a function, its termination may correspond to a return of the function to the calling function or the main function.
  • instructions to perform the necessary tasks may be stored in a computer readable medium that may or may not be included in a host apparatus or host system.
  • the instructions may be transmitted over the computer-readable medium and be loaded onto the host apparatus or host system.
  • the instructions are configured to cause the host apparatus or host system to perform one or more functions disclosed herein.
  • at least one memory may include or store instructions, the at least one memory and the instructions may be configured to, with at least one processor, cause the host apparatus or host system to perform the one or more functions.
  • the processor, memory and instructions serve as means for providing or causing performance by the host apparatus or host system of one or more functions disclosed herein.
  • the host apparatus or host system may be a general-purpose computer and / or computing system, a special purpose computer and / or computing system, a programmable processing apparatus and / or system, a machine, etc.
  • the host apparatus or host system may be or include or be part of: a user equipment, client device, mobile phone, laptop, computer, network element, data server, network resource controller, network apparatus, router, gateway, network node, cloud-based server, web server, application server, proxy server, etc.
  • FIG. 8 illustrates an example embodiment of an apparatus 9000.
  • the apparatus may be configured to host at least one data consumer entity as disclosed herein.
  • the apparatus may be configured to perform one or several of the methods disclosed herein.
  • the apparatus 9000 may include at least one processor 9010 and at least one memory 9020.
  • the apparatus 9000 may include one or more communication interfaces 9040 (e.g. network interfaces for access to a wired / wireless network, including an Ethernet interface, a Wi-Fi interface, etc.) connected to the processor and configured to communicate via wired / wireless communication link(s).
  • the apparatus 9000 may include user interfaces 9030 (e.g. keyboard, mouse, display screen, etc) connected with the processor.
  • the apparatus 9000 may further include one or more media drives 9050 for reading a computer-readable storage medium (e.g. digital storage disc 9060 (CD-ROM, DVD, Blu-ray, etc.), USB key 9080, etc.).
  • the processor 9010 is connected to each of the other components 9020, 9030, 9040, 9050 in order to control operation thereof.
  • the memory 9020 may include a random access memory (RAM), cache memory, non-volatile memory, backup memory (e.g., programmable or flash memories), read-only memory (ROM), a hard disk drive (HDD), a solid state drive (SSD) or any combination thereof.
  • the ROM of the memory 9020 may be configured to store, amongst other things, an operating system of the apparatus 9000 and / or one or more computer program code of one or more software applications.
  • the RAM of the memory 9020 may be used by the processor 9010 for the temporary storage of data.
  • the processor 9010 may be configured to store, read, load, execute and/or otherwise process instructions 9070 stored in a computer-readable storage medium 9060, 9080 and / or in the memory 9020 such that the instructions, when executed by the processor, cause the apparatus 9000 to perform one or more or all steps of a method described herein for the concerned apparatus 9000.
  • the instructions may correspond to program instructions or computer program code.
  • the instructions may include one or more code segments.
  • a code segment may represent a procedure, function, subprogram, program, routine, subroutine, module, software package, class, or any combination of instructions, data structures or program statements.
  • a code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters or memory contents.
  • Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable technique including memory sharing, message passing, token passing, network transmission, etc.
  • a processor or likewise a processing circuit may correspond to a digital signal processor (DSP), a network processor, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a System-on-Chip (SoC), a Central Processing Unit (CPU), an arithmetic logic unit (ALU), a programmable logic unit (PLU), a processing core, a programmable logic, a microprocessor, a controller, a microcontroller, a microcomputer, a quantum processor, or any device capable of responding to and/or executing instructions in a defined manner and/or according to a defined logic. Other hardware, conventional or custom, may also be included.
  • a processor or processing circuit may be configured to execute instructions adapted for causing the host apparatus or host system to perform one or more functions disclosed herein for the host apparatus or host system.
  • a computer readable medium or computer readable storage medium may be any tangible storage medium suitable for storing instructions readable by a computer or a processor.
  • a computer readable medium may be more generally any storage medium capable of storing and/or containing and/or carrying instructions and/or data.
  • the computer readable medium may be a non-transitory computer readable medium.
  • the term "non-transitory”, as used herein, is a limitation of the medium itself (i.e., tangible, not a signal) as opposed to a limitation on data storage persistency (e.g., RAM vs. ROM).
  • a computer-readable medium may be a portable or fixed storage medium.
  • a computer readable medium may include one or more storage devices such as a permanent mass storage device, a magnetic storage medium, an optical storage medium, a digital storage disc (CD-ROM, DVD, Blu-ray, etc.), a USB key or dongle or peripheral, or a memory suitable for storing instructions readable by a computer or a processor.
  • a memory suitable for storing instructions readable by a computer or a processor may be for example: read only memory (ROM), a permanent mass storage device such as a disk drive, a hard disk drive (HDD), a solid state drive (SSD), a memory card, a core memory, a flash memory, or any combination thereof.
  • the wording "means configured to perform one or more functions” or “means for performing one or more functions” may correspond to one or more functional blocks comprising circuitry that is adapted for performing or configured to perform the concerned function(s).
  • the block may itself perform this function or may cooperate and / or communicate with one or more other blocks to perform this function.
  • the "means” may correspond to or be implemented as "one or more modules", “one or more devices", “one or more units”, etc.
  • the means may include at least one processor and at least one memory including computer program code, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause an apparatus or system to perform the concerned function(s).
  • circuitry may refer to one or more or all of the following:
  • circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware.
  • circuitry also covers, for example and if applicable to the particular claim element, an integrated circuit for a network element or network node or any other computing device or network device.
  • circuitry may cover digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), etc.
  • the circuitry may be or include, for example, hardware, programmable logic, a programmable processor that executes software or firmware, and/or any combination thereof (e.g. a processor, control unit/entity, controller) to execute instructions or software and control transmission and receptions of signals, and a memory to store data and/or instructions.
  • the circuitry may also make decisions or determinations, generate frames, packets or messages for transmission, decode received frames or messages for further processing, and other tasks or functions described herein.
  • the circuitry may control transmission of signals or messages over a radio network, and may control the reception of signals or messages, etc., via one or more communication networks.
  • although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and similarly, a second element could be termed a first element, without departing from the scope of this disclosure.
  • the term "and/or" includes any and all combinations of one or more of the associated listed items.

Description

    TECHNICAL FIELD
  • Various example embodiments relate generally to a method and apparatus for performing data analytics on measurement data.
  • BACKGROUND
  • Data analytics may be performed on time series of measurement data, such as for example multi-variate performance management (PM) data time series. The input of such data analytics is historically collected measurement data that is available for a specific time period and one or more measured network entities (e.g. RAN, Radio Access Network, cells) in a communication network. Measurement data may contain a separate time series of measured values per network entity and / or per measurement parameter. Alternatively, a time series of values may include values of several measurement parameters for one or more network entities.
  • Although the network entities whose measurement data constitute the input of a given data analytics task are of the same kind, they may have heterogeneous configurations. For example, in a RAN case, if the measurement data covers a large number of RAN cells, then it is likely that some cells will have certain radio features and / or functionalities enabled in their configuration, while others will not.
  • Furthermore, the measurement data per network entity will most likely contain some measurements which are specific to a certain functionality. These make sense for a specific network entity only in periods of time when the given functionality is enabled in the network entity's configuration. For example the CA (Carrier Aggregation) throughput for a radio cell makes sense only when the radio cell is configured for CA.
  • Measurements of this type are referred to herein as "conditional measurements", as the measurement data make sense, i.e. are available, at a given timestamp only on the condition that the network entity's configuration at that timestamp supports the measurement such that a measured value is available for the measurement parameter at the given timestamp.
  • The time periods during which no measured value is available (e.g. because the network entity's configuration does not support the measurement for a given measurement parameter) are referred to herein as the "unsupported (measurement) periods", and the measurements performed during these "unsupported periods" are referred to herein as the "unsupported measurements".
  • However, the collected measurement data must contain values for all timestamps and for the one or more network entities, even for unsupported periods during which no measured value is available because the network entity's configuration does not support the given conditional measurement. Also, the data repositories of the network operator are configured to store measured values even during the unsupported periods.
  • There are several ways of handling this in practice.
  • The first (explicit) way is to store a specific value (e.g. NULL value) during the unsupported periods: this specific value is not a valid measured value for the measurement parameter and can be distinguished from any other measured value, but this specific value is an explicit indication that the measurement was not supported at the timestamp.
  • The second (inaccurate) way is to replace the measured value with a specific numerical value that is a valid value for the measurement parameter, but this specific numerical value cannot be distinguished from a "true" measured value obtained outside an unsupported period. This specific numerical value is referred to herein as a "special value" or "special numerical value" for a conditional measurement. Such a special value can be a value which is in the range of valid measured values (e.g. value 0 for the CA throughput), or can be a value that is a valid value for the measurement parameter, but not in the range of valid measured values (e.g. -1 for CA throughput).
  • In these conditional measurement cases, when the time series of measured values are filled in with a special value at some timestamps, a data analytics functionality most often has no information on the presence of such special value in the numerical measured values. The data analytics functionality therefore cannot differentiate numerically between normal values, and special values filling in for conditional measurements at timestamps when the values of the measurement parameter are not available (due to configuration or any other measurement failure reasons), and it can result in misleading data analytics results. Also this makes root cause analysis (RCA) of the data analytics results harder.
    US2020125471A1 relates to machine-learning systems and methods for seasonal pattern detection and forecasting.
    US2010027432A1 discloses a method to generate impact scores based on observed network traffic.
    WO02021170238A1 relates to the generation and consumption of analytics in a mobile network, e.g., in 5th generation mobile or cellular communication (5G) systems (5GS) and networks.
  • SUMMARY
  • The scope of protection is set out by the independent claims. Dependent claims define preferred embodiments.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Example embodiments will become more fully understood from the detailed description given herein below and the accompanying drawings, which are given by way of illustration only and thus are not limiting of this disclosure.
    • FIG. 1 illustrates an example communication system according to an example.
    • FIG. 2 shows a simplified example of a time series of values according to an example.
    • FIG. 3 shows a flowchart of a method for processing measurement data according to an example.
    • FIG. 4A shows a flowchart of a method for performing data analytics according to an example.
    • FIG. 4B shows a flowchart of a method for performing data analytics according to an example.
    • FIG. 5 shows a flowchart of a method for detecting a special numerical value according to an example.
    • FIG. 6 shows a flowchart of a method for assigning flags to measured values according to an example.
    • FIG. 7 shows a statistical distribution of lengths of same-value sequences according to an example.
    • FIG. 8 illustrates an example embodiment of an apparatus 9000 according to an example.
  • It should be noted that these drawings are intended to illustrate various aspects of devices, methods and structures used in example embodiments described herein. The use of similar or identical reference numbers in the various drawings is intended to indicate the presence of a similar or identical element or feature.
  • DETAILED DESCRIPTION
  • Detailed example embodiments are disclosed herein. However, specific structural and/or functional details disclosed herein are merely representative for purposes of describing example embodiments and providing a clear understanding of the underlying principles.
  • One or more example embodiments describe methods for processing time series of measured values obtained for a measurement parameter for respective timestamps and for each of one or more network entities.
  • The method infers - solely from the numerical values in the time series of measured values - which special value is used for the unsupported measurements, which measured values correspond to unsupported measurements, at which timestamps they were actually obtained and when the measured values were not available for the different network entities (e.g. due to configuration or potentially other reasons).
  • The method includes a first phase in which the special value used in the time series of values is detected and a second phase in which flags are assigned respectively to each value in a time series of values. A flag assigned to a measured value obtained at a given timestamp indicates whether or not the measurement was available at the given timestamp in the time series of values. Data analytics may then be performed on the time series of values, using the corresponding flags.
  • The method is scalable to any measurement data set. The method is designed to be scalable to any number of network entities, any number of measurement parameters and any number of measured values in the measurement time period per network entity.
  • This method addresses a relevant practical problem in implementing data analytics algorithms and allows anomaly detection (AD) at the data analytics stage.
  • FIG. 1 illustrates an example communication system 100 in which the method may be implemented. The communication system is configured to perform data collection through one or more communication networks.
  • Data collection in a communication network may be defined according to a service-oriented approach described as an interaction between a data consumer and a data provider. The data consumer requests data from the data provider when the data consumer needs data to perform a given task or on a subscription basis to receive data when they are available.
  • For illustrative purpose, the communication system 100 includes several network devices including data providers 150, 160, 170 and a data consumer 190.
  • A data provider 150, 160, 170 may be any network device or network function that is configured to generate data (e.g. measurement data) and to provide the generated data to at least one data consumer 190.
  • In the example of FIG. 1, the data provider #1 150 is configured to generate several time series 151, 152, 153 of measured values for a measurement parameter (e.g. a cell throughput) concerning a first measured network object (e.g. a first radio cell). Likewise, the data provider #2 160 is configured to generate several time series 161, 162, 163 of measured values for the same measurement parameter concerning a second measured network object (e.g. a second radio cell). Likewise, the data provider #3 170 is configured to generate several time series 171, 172, 173 of measured values for the same measurement parameter concerning a third measured network object (e.g. a third radio cell).
  • The data consumer 190 may be any network device or network function that is configured to collect data (e.g. measurement data) from one or more data providers 150, 160, 170. The data consumer 190 may be configured to store the collected data in a database 180. The data consumer 190 may be configured to perform data analytics on the collected data and generate data analytics results 195.
  • A time series of values for a network entity and a measurement parameter includes values of the measurement parameter that may be obtained for respective timestamps (e.g. evenly spaced timestamps corresponding to time steps) inside a measurement time period (e.g. historical time period). There may be one timestamp for each measurement interval inside the measurement time period.
  • A measurement parameter may be any measurable quantity or counter that can be represented by a numerical value. The method is agnostic to the type of parameter that is measured. Examples include, without limitation: a throughput, a channel quality, a bandwidth, a signal-to-noise ratio, a processing load, a power, a current, a voltage, a phase, an amplitude, a temperature, counters of higher layers of the communication network (e.g. packet loss counters, number of successful/failed UE connection attempts, ...), etc. The measurement parameter may be measured directly by an appropriate sensor or a signal detector configured to detect a signal representative of the physical parameter. The measurement parameter may be derived based on detections performed by one or more sensors and / or one or more signal detectors.
  • The measurement parameter may concern a network entity, also referred to herein as the "measured object" or "measured entity" or "measured network entity". A measured network entity may correspond to various entities: a physical device (e.g. a base station, a user equipment, a router, a gateway, a controller, etc) in a communication network, a communication medium in a communication network (e.g. a radio channel or radio subchannel, a frequency band, etc), a radio cell in a communication network, a functionality in a communication network, etc.
  • The number of distinct measured objects in the time series of values may be high, for example tens of thousands in a RAN cell case. The measurement interval between two values used in typical cases may range for example from one hour to five minutes, while the total historic time period may range for example from several months to one day or one hour.
  • FIG. 2 shows a simplified example of a time series 200 of values, where each value is represented by a box. The different measured values are marked with different patterns at the measurement timestamps.
  • This example time series 200 includes 20 values for corresponding timestamps. Each value may be equal to v1, v2, v3, v4, v5 or v6 as represented by FIG. 2.
  • As shown in the figure, the time series may include sequences of values at consecutive timestamps (e.g. corresponding to time steps) during which the measured value remains the same. These sequences are referred to herein as same-value sequences. Each same-value sequence has a length in number of timestamps, which can be 1 or larger, and a value, which corresponds to the unchanged measured value during the sequence. In the example of FIG. 2, the value v1 is repeated 7 times and therefore the length of this same-value sequence of value v1 is equal to 7.
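  • Extracting same-value sequences amounts to a run-length encoding of the time series. The sketch below is illustrative only: the function name is hypothetical and the sample series is made up, since the exact values of FIG. 2 are not reproduced here.

```python
from itertools import groupby


def same_value_sequences(series):
    """Return a list of (value, length) pairs, one per same-value sequence."""
    return [(value, len(list(group))) for value, group in groupby(series)]


# Hypothetical series: v1 repeated 3 times between two single values.
example = ["v2", "v1", "v1", "v1", "v3"]
runs = same_value_sequences(example)
```

  • Here `runs` holds one pair per sequence, so a value that recurs later in the series yields a separate entry for each occurrence.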
  • A sliding time window 210 may be applied to the time series of values 200 to analyze the values within the sliding time window 210, for example to detect a number of changes of values within the sliding time window 210. In the example of FIG. 2, the sliding time window 210 has a length of 7 (it includes 7 values) and at the position represented in the figure, 4 changes of values occur within the sliding time window 210.
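  • Counting value changes per window position can be sketched as follows. The function name and the sample data are assumptions; a change is counted here whenever two adjacent values inside the window differ.

```python
def changes_per_window(series, window):
    """Count value changes inside each position of a sliding window.

    A window of `window` values spans `window - 1` adjacent pairs, so that is
    the maximum number of changes a single window position can contain.
    """
    # changes[i] is 1 when series[i + 1] differs from series[i]
    changes = [int(a != b) for a, b in zip(series, series[1:])]
    span = window - 1
    return [sum(changes[i:i + span]) for i in range(len(series) - window + 1)]


counts = changes_per_window([1, 1, 2, 2, 3, 1, 1, 1], window=4)
```

  • The maximum of `counts` over all positions is the quantity that a change-count condition would compare against its threshold.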
  • FIG. 3 shows a flowchart of a method for processing time series of measured values according to an example.
  • The steps of the method may be implemented by an apparatus configured to implement a data consumer according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • In step 300, a time series of values of a measurement parameter for respective timestamps is obtained, for each of one or more network entities. Each time series of values may include measured values and a special numerical value at one or more timestamps. As explained herein, the special numerical value is used in the time series at a given timestamp for replacing a value of the measurement parameter when no measured value is available for the measurement parameter at the given timestamp.
  • In step 310, the time series of values are parsed to determine which numerical value in the time series of values corresponds to the special numerical value.
  • The parsing may be based on the verification of one or more conditions. The one or more conditions may include at least a first condition #1 and a second condition #2.
  • The first condition #1 to be verified during the parsing may be based on the detection of same-values sequences having a minimum length L in the time series of values. The first condition #1 may be verified if the set S of values for which same-values sequences having the minimum length L are detected include only one value. In this case, the sole value v0 in the set of values is identified as being the special value v0.
  • The second condition #2 to be verified during the parsing may be based on a count of value changes occurring in a sliding time window of a given length Wch applied to the time series of values. The second condition #2 may be verified if the count of value changes occurring in the sliding time window is above a threshold Nch for at least one temporal position of the sliding time window. This means that there exists at least one time window of length Wch in the time series obtained for the network entities, in which the measurement changes value frequently enough, at least Nch times.
  • A third condition #3 may be verified during the parsing. By using the three conditions #1, #2 and #3 together, one can make it very likely that the detected measurements are conditional measurements that use a special value when a measured value is not available.
  • The third condition #3 may be based on a ratio q(v0) of the number N(v0) of values, in the one or more time series of values obtained respectively for the one or more network entities, that are equal to the special value v0 found based on the first condition, over the total number N(v) of measured values in these time series. The ratio q(v0) = N(v0)/N(v) may be compared with a threshold Qth, and the third condition is met if the ratio q(v0) is below the threshold Qth.
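  • The three conditions can be combined into a single parsing routine, sketched below under stated assumptions. The function name, the parameter names and the return convention (the detected value, or None when a condition fails) are hypothetical, and condition #2 is interpreted here as "at least n_ch changes in some window".

```python
from itertools import groupby


def detect_special_value(all_series, L, w_ch, n_ch, q_th):
    """Return the detected special value v0, or None if a condition fails."""
    # Condition #1: exactly one value forms same-value sequences of length >= L.
    long_values = set()
    for series in all_series:
        for value, group in groupby(series):
            if len(list(group)) >= L:
                long_values.add(value)
    if len(long_values) != 1:
        return None
    (v0,) = long_values

    # Condition #2: at least one window of w_ch values holds >= n_ch changes.
    def max_changes(series):
        changes = [int(a != b) for a, b in zip(series, series[1:])]
        span = w_ch - 1  # a window of w_ch values spans w_ch - 1 adjacent pairs
        starts = range(len(series) - w_ch + 1)
        return max((sum(changes[i:i + span]) for i in starts), default=0)

    if not any(max_changes(series) >= n_ch for series in all_series):
        return None

    # Condition #3: q(v0) = N(v0) / N(v) must stay below q_th.
    n_total = sum(len(series) for series in all_series)
    n_v0 = sum(series.count(v0) for series in all_series)
    return v0 if n_v0 / n_total < q_th else None
```

  • Returning None rather than raising keeps the caller free to skip measurement parameters that do not look like conditional measurements.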
  • In step 320, based on the result of the parsing step 310, flags are assigned to the values in the time series of values. A flag assigned to a value obtained at a given timestamp indicates whether or not the measurement was available at the given timestamp in the time series of values.
  • For assigning a flag to a value, the method may comprise: determining whether the special value is a value out of a normal range of values in which the measured values fall, or a value in the normal range of values.
  • This determination may be based on a comparison between a first count C0 of same-value sequences with the special value in time series of values obtained for the one or more network entities and a second count CS of same-value sequences with the special value in time series of values obtained for the one or more network entities that are shorter than a threshold L.
  • When the ratio between the second count CS and the first count C0 is below a threshold Pth, it is determined that the special value is a value out of the normal range of values. In this case, a flag assigned to a measured value is equal to a first flag value if the concerned measured value is equal to the special value and a second flag value otherwise.
  • When the ratio between the second count CS and the first count C0 is above the threshold Pth, it is determined that the special value is a value in the normal range of values. In this case, the method may comprise: using a statistical distribution of the lengths of same-values sequences with the special value that are shorter than a threshold to detect that a same-value sequence with the special value has a length that is an outlier in the statistical distribution.
  • A flag corresponding to a given timestamp takes a first flag value (e.g. the first flag value is 1) for each value in the time series of values that is equal to the special value when the length of the same-values sequence including the concerned value at the given timestamp is an outlier in the statistical distribution and a second flag value (e.g. the second flag value is 0) otherwise.
  • Using the statistical distribution to detect an outlier may be performed using a classification algorithm to detect that the length value is an outlier in the statistical distribution of length values.
  • In step 330, data analytics may be performed on the time series of values based on the flags to generate data analytics results.
  • Data analytics tasks may be categorized into two broad classes based on the length of the analyzed time period. On the one hand, there are offline or batch data analytics tasks, when the data analytics is done for data collected for a long historical time period of one or several months, typically with measurement interval of 1 hour. On the other hand, in online cases the historical time period is one or a few days long, and the measurement interval is usually below 1 hour.
  • In step 340, one or more operations may be performed on one or more network devices and / or network functions based on the data analytics results. The operation may depend on the context and / or a scenario and / or network environment and / or the type of measurement parameter being monitored. The one or more operations may include at least one of a configuration operation, a resource management operation, a monitoring operation, a channel estimation, an optimization operation, a repair operation, a maintenance operation, a restart, a reboot, a software update, a signaling operation, etc.
  • FIG. 4A shows a flowchart of a method for performing data analytics according to an example. The steps of the method may be implemented by an apparatus according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • Time series 451, 452, 453 of measured values are stored within a database and provided as input to the method. The time series 451, 452, 453 include measured values per measurement parameter and per each network entity for a given analyzed time period.
  • In step 410, the time series 451, 452, 453 are parsed to detect special values in the measured values. This parsing may be performed as disclosed for example by reference to FIG. 3 (step 310) and / or FIG. 5. As output of step 410, a table 415 of unsupported measurements may be generated for the measured values that are equal to the detected special value.
  • The table 415 may include one row per measured value. One row may include the name of the measurement parameter, the associated timestamp of the measured value and the detected special value.
  • In step 420, flags are assigned respectively to the values in the time series 451, 452, 453 (one flag per value). This assignment may be performed as disclosed for example by reference to FIG. 3 (step 320) and / or FIG. 6. As output of step 420, series of flags 425 corresponding respectively to values in the time series of values are generated (one flag per measured value).
  • A flag may be a binary value. The flag may be equal to a first flag value (e.g. 1) if the measurement is an unsupported measurement and is equal to a second flag value (e.g. 0) otherwise. A flag is indicative that at the given timestamp no measurement was available (e.g. the functionality required for the measurement was most likely not enabled in the given network entity's configuration).
  • It is possible that a network entity's configuration has been changed during the analyzed time period, even several times, so the flag value for the same measurement in the per-timestamp profile vectors of a network entity can be 1 in certain sub-periods of the whole historical data time period and 0 in others. When the flag value is set to 1, the measured value for the network entity is not to be interpreted as a real measurement.
  • In step 430A, data analytics are performed on the time series 451, 452, 453, independently of the knowledge of the presence of special values in these time series. As output of step 430A, data analytics results 435A are generated. In FIG. 4A, the data analytics (step 430A) is executed for the whole time series and the steps 410 and 420 may be executed in parallel with the data analytics (step 430A).
  • In step 440A, the data analytics results 435A are analyzed together with the series of flags generated in step 420. The analysis may include interpretation and / or Root Cause Analysis (RCA) of the data analytics results 435A by a human expert. The analysis may include any other analysis task performed by a human expert and / or by an analysis software. The human expert may interpret the data analytics results 435A by using the table 415 of unsupported measurements and/or the series of flags 425 assigned to the measured values.
  • FIG. 4B shows a flowchart of a method for performing data analytics according to an example.
  • The steps of the method may be implemented by an apparatus according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • The method of FIG. 4B is a variant of the method of FIG. 4A and the steps 410, 420 are the same in both methods.
  • In step 430B, data analytics are performed on the time series 451, 452, 453, based on the series of flags, i.e. with the knowledge of the presence of special values in these time series. As output of step 430B, data analytics results 435B are generated.
  • During the data analytics, the time series of values may be split into partitions, such that the per network entity per timestamp measured values in a partition have the same flag values assigned to them. Then the data analytics is executed independently for each partition.
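  • The partitioning described above can be sketched as follows. This is only a minimal illustration, not the claimed implementation: the record layout (a dictionary with "entity", "timestamp", "values" and "flags" keys) and the function name partition_by_flags are assumptions introduced for the example. Records whose flags agree for every measurement parameter end up in the same partition.

```python
from collections import defaultdict


def partition_by_flags(records):
    """Group per-entity, per-timestamp records by their combination of flag values.

    Each record is assumed (hypothetically) to look like:
    {"entity": ..., "timestamp": ..., "values": {...}, "flags": {param: 0 or 1}}
    """
    partitions = defaultdict(list)
    for record in records:
        # The partition key is the full flag pattern, ordered by parameter name.
        key = tuple(sorted(record["flags"].items()))
        partitions[key].append(record)
    return dict(partitions)


records = [
    {"entity": "cell_a", "timestamp": 0, "values": {"m1": -1, "m2": 7}, "flags": {"m1": 1, "m2": 0}},
    {"entity": "cell_a", "timestamp": 1, "values": {"m1": 4, "m2": 8}, "flags": {"m1": 0, "m2": 0}},
    {"entity": "cell_b", "timestamp": 0, "values": {"m1": -1, "m2": 9}, "flags": {"m1": 1, "m2": 0}},
]
partitions = partition_by_flags(records)
```

  • The data analytics of step 430B could then be run once per entry of partitions.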
  • In step 440B, the data analytics results 435B are analyzed. As for step 440A, the analysis may include interpretation and / or Root Cause Analysis (RCA) of the data analytics results 435B by a human expert. The analysis may include any other analysis task performed by a human expert and / or by an analysis software. Similarly to FIG. 4A, the human expert may interpret the per-partition analytics result using the table 415 of unsupported measurements and/or the series of flags 425 assigned to the measured values.
  • The method helps to avoid misinterpretation of data analytics results. For example, the data analytics may implement an unsupervised classification that automatically classifies the per timestamp per network entity measured values into a small number of classes. These classes can be seen as the learned set of possible entity states in which the network entities can be at a given time. In this example, we assume that some of the network entities were configured in certain sub-periods such that a given measurement's prerequisite was not enabled for them, thus the measurements were filled with the special numerical value for these sub-periods. If the unsupervised classification assigns all the measured values equal to the special numerical value to a separate class, this class is effectively learned as a separate entity state if the special numerical values are not detected in the measured data as disclosed herein. This entity state can easily be misinterpreted as some kind of performance issue, while it is in fact just a configuration state. Alternatively, in case of an unusual special value selection, the PM statistics/symptoms of the entity state can be hard for the expert to understand.
  • For the concrete example of unsupervised entity state learning, in case of the variant of FIG. 4A, by checking the flags of the measured values classified into a specific entity state and seeing that for all of them a given measurement's flag value is 1, the human expert interpreting the analytics results can conclude that the entity state corresponds to a configuration state. This way the method helps the interpretation of the data analytics results. In case of the variant of FIG. 4B, it is even more straightforward, as the data analytics results to be interpreted are already for a specific configuration-related profile.
  • It is of utmost importance to make the interpretation of data analytics results for humans as easy as possible, especially in case of unsupervised analytics like in the above example, because interpretability and explainability of the data analytics results is an essential requirement, for example for adoption and configuration of a machine learning (ML) model.
  • FIG. 5 shows a flowchart of a method for detecting a special numerical value according to an example.
  • The steps of the method may be implemented by an apparatus according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • The method concerns the processing applied, for one specific measurement parameter, to the time series of measured values obtained for this measurement parameter. The method may be applied for several parameters, by performing the steps independently for each measurement parameter.
  • In this method several parameters may be used, in which:
    • L is a lower threshold for the length of long same-value sequences, where L is expressed in number of time steps;
    • Wch is a length of a sliding window W, expressed in number of time steps;
    • Nch is a lower threshold for the number of value changes in the sliding window W;
    • Qth is an upper threshold for the relative frequency of the detected special value v0.
  • In step 500, the variables S and N are initialized.
  • S is the set of measurement values occurring in same-value sequences of length at least equal to L. S is initialized to the empty set.
  • N is the number of sliding windows of length Wch in which the number of value changes is at least Nch. N is initialized to zero.
  • Step 510 is performed for each network entity for which time series of measured values are obtained. Step 510 includes steps 511 and 512.
  • In step 511, the set S is updated by adding to the set S the measured value of each same-value sequence in the time series whose length is at least L provided that this measured value is not yet in the set S.
  • In step 512, the value of N is updated by incrementing N with the number of sliding windows of length Wch over the time series in which the number of changes is at least Nch for a given network entity.
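  • Steps 500 to 512 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: each time series is assumed to be a plain Python list, the function names run_lengths, count_busy_windows and collect_s_and_n are hypothetical, and min_len stands for the parameter L.

```python
from itertools import groupby


def run_lengths(series):
    """Return (value, length) pairs for each same-value sequence, in order."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(series)]


def count_busy_windows(series, wch, nch):
    """Count sliding windows of wch time steps with at least nch value changes."""
    # changes[i] is 1 when the value at step i + 1 differs from the value at step i.
    changes = [int(a != b) for a, b in zip(series, series[1:])]
    busy = 0
    for start in range(len(series) - wch + 1):
        # A window of wch values contains wch - 1 consecutive pairs.
        if sum(changes[start:start + wch - 1]) >= nch:
            busy += 1
    return busy


def collect_s_and_n(series_per_entity, min_len, wch, nch):
    s, n = set(), 0                                    # step 500
    for series in series_per_entity.values():          # step 510
        # Step 511: values of same-value sequences of length >= L.
        s |= {v for v, length in run_lengths(series) if length >= min_len}
        # Step 512: windows with at least Nch value changes.
        n += count_busy_windows(series, wch, nch)
    return s, n


# Invented sample data: -1 plays the role of the fill-in value.
sample = {
    "cell_a": [-1] * 6 + [3, 5, 3, 5, 4, 6],
    "cell_b": [2, 4, 2, 4, 2, 4] + [-1] * 6,
}
s, n = collect_s_and_n(sample, min_len=5, wch=4, nch=3)
```

  • With this invented sample, the only value found in long same-value sequences is -1, and several windows in both entities contain enough value changes, so both S and N carry evidence for a conditional measurement.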
  • Step 520 is performed when the steps 511 and 512 have been performed for all network entities for which the time series of measured values are obtained. In step 520, the first condition #1 is tested. The first condition #1 is verified if the set S includes only one value, noted v0. If the first condition #1 is verified, step 530 is executed after step 520. Otherwise the method ends.
  • If the first condition #1 is verified, this means that there is at least one long same-value sequence of length >= L with the value v0. If there are several long same-value sequences, then all have the same value v0 independently of when and for which network entity they occurred. If the condition #1 is met, then the single value v0 in S is kept as the candidate for special value of the measurement. Otherwise it is determined that the measurements are not conditional measurements and the method terminates.
  • This condition #1 is based on the natural assumption, that if a given measurement is not available for a network entity due to its configuration, then this configuration state lasts for a longer time period, so the time periods with unchanged special value are most likely long.
  • The parameter L is configured to define what is long, in number of measurement time steps. L should be high enough such that a normally measured parameter most likely changes value during any period of this length. With such selection of L, we can assume that all same-value sequences with length >=L are most likely those with a special value due to not available measurement. L may for example be equal to the number of measurement timestamps in one day, because of the cyclic behavior of communication networks that follows the daily periodicity of human activities. When a network entity goes over its whole operational cycle during a day, a normal supported measurement is more likely changing values over this period.
  • However, all the long same-value sequences with length >= L must have the same special value for the measurement, to support the assumption that the measurement collection system always fills in the same special value when a measurement is not available. This is checked by the requirement that S contains a single value after having processed the time series of all network entities.
  • In step 530, it is determined that the single value v0 in the set S is a candidate special value v0.
  • In step 540, the second condition #2 is tested. The second condition #2 is verified if N > 0. If the second condition #2 is verified, step 550 is executed after step 540. Otherwise the method determines that the special value v0 found at step 530 is not used for unsupported measurements and the method ends.
  • If the second condition #2 is verified, then there exists a time window of length Wch in the time series of at least one of the network entities, in which the measurement changes value frequently enough, at least Nch times, so that one can assume that in that time window the measurement was available and measured normally for the network entity.
  • Note that here the number of value changes is counted, not the number of different values taken during the sliding window. For example, there can be many value changes just by switching between two different values.
  • While condition #1 collects evidence that one or more network entities have long periods when the measurement is not available (represented with the same special value in their time series), condition #2 collects evidence that there are one or more periods for some network entity or network entities when the measurement is available and measured normally, as indicated by frequent value changes. Checking this condition #2 is done because unsupported measurement means not only that the measurement is not available for network entities and periods when their configuration lacks the required feature, but also that it is available for other network entities and/or other periods.
  • The adjustment of the values of Wch and Nch may be performed in different manners. However, if the value of L is selected to span one day as discussed above, then Wch could be equal to L and Nch be selected such that Nch > Wch / Lmin, where Lmin is a period length selected such that the configuration of a network entity remains the same during Lmin with high probability. This selection of Nch prevents value changes between the special value and normal measured values, which may happen due to configuration changes, from being misinterpreted as normal changes between normal measured values.
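  • The difference between counting value changes and counting distinct values can be illustrated with a short sketch. The helper name count_value_changes and the sample window are assumptions made for the example.

```python
def count_value_changes(window):
    """Count positions where a value differs from the value just before it."""
    return sum(1 for a, b in zip(window, window[1:]) if a != b)


# Invented window that only alternates between two values.
window = [10, 20, 10, 20, 10, 20]
distinct_values = len(set(window))
value_changes = count_value_changes(window)
```

  • Here the window contains only two distinct values but five value changes, so it would count as frequently changing for a threshold Nch of up to 5.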
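  • One way to read the bound Nch > Wch / Lmin is that a window of Wch time steps can be expected to contain at most about Wch / Lmin configuration changes, so requiring more value changes than that means they cannot all be explained by configuration changes. The sketch below computes the smallest integer Nch satisfying the bound; the function name min_nch and the numeric values (hourly measurements, Lmin of 6 hours) are assumptions chosen only for illustration.

```python
def min_nch(wch, lmin):
    """Smallest integer Nch such that Nch > wch / lmin (both in time steps)."""
    return wch // lmin + 1


# Hourly measurement interval: L = Wch = 24 time steps span one day.
wch = 24
lmin = 6   # assumed: the configuration rarely changes within 6 hours
nch = min_nch(wch, lmin)
```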
  • In step 550, q(v0) is computed over all measured entities. q(v0) is the relative frequency of the candidate special value v0 over all measurements. q(v0) gives what fraction of all the measurements in the time series of all network entities takes the candidate special value v0.
  • In step 560, the third condition #3 is tested. The third condition #3 is verified if q(v0) < Qth. If the third condition #3 is verified, step 570 is executed after step 560. Otherwise it is determined that the measurements are not conditional measurements and the method ends.
  • The third condition #3 is optional, but it improves the method in specific cases because there are often measurement parameters that take the same value most of the time, even when they are available and measured normally. Examples are counters of very rare error events, which take the value 0 most of the time. Based on the first two conditions alone, these types of measurements can very easily be mistaken for unsupported measurements with the special value being their usual value (0 for the error counters example), even when they are available and measured normally for all network entities and over the whole measurement period. As a trade-off, to avoid false positives, these situations are detected by using this third condition #3. The value of Qth can be set to some value slightly lower than 1, for example between 0.9 and 1 or between 0.99 and 1.
  • In step 570, it is determined that the measurements are conditional measurements using the special value v0 when a measured value is not available.
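  • The whole flow of FIG. 5 (steps 500 to 570) can be sketched as a single function. This is a minimal sketch under stated assumptions rather than the claimed implementation: the function name detect_special_value is hypothetical, each time series is assumed to be a Python list, min_len stands for L, and the function returns None whenever one of the three conditions fails.

```python
from itertools import groupby


def detect_special_value(series_per_entity, min_len, wch, nch, qth):
    """Return the special value v0, or None if the measurement is not judged conditional."""
    long_values = set()   # S
    busy_windows = 0      # N
    for series in series_per_entity.values():
        # Step 511: collect values of same-value sequences of length >= L.
        for value, group in groupby(series):
            if sum(1 for _ in group) >= min_len:
                long_values.add(value)
        # Step 512: count windows of Wch steps with at least Nch value changes.
        changes = [int(a != b) for a, b in zip(series, series[1:])]
        for start in range(len(series) - wch + 1):
            if sum(changes[start:start + wch - 1]) >= nch:
                busy_windows += 1

    if len(long_values) != 1:          # condition #1 (step 520)
        return None
    v0 = next(iter(long_values))       # step 530
    if busy_windows == 0:              # condition #2 (step 540)
        return None
    total = sum(len(s) for s in series_per_entity.values())
    q = sum(s.count(v0) for s in series_per_entity.values()) / total   # step 550
    if q >= qth:                       # condition #3 (step 560)
        return None
    return v0                          # step 570


# Invented data: a conditional measurement filled with -1, and a rare-error counter.
conditional = {
    "cell_a": [-1] * 6 + [3, 5, 3, 5, 4, 6],
    "cell_b": [2, 4, 2, 4, 2, 4] + [-1] * 6,
}
error_counter = {"cell_a": [0] * 23 + [1]}
```

  • With these invented inputs, the first series yields -1 as the special value, while the error counter passes conditions #1 and #2 (for Nch = 1) but is rejected by condition #3 because 0 makes up more than 90% of its values.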
  • The method is biased toward increasing the reliability that a measurement detected as an unsupported measurement by the method is indeed an unsupported measurement, at the expense of potentially missing the detection of some measurements that are unsupported measurements in reality.
  • In general, a conditional measurement can fall into two categories with respect to its special value:
  a) the special value is out of the range of normal measured values, or
  b) it can be the result of a normal measurement too.
  • In both cases, the special value is a valid value of the measurement parameter, e.g. a valid floating point or integer value.
  • In case of category a) the determination whether the measured value was available at a given timestamp for a given network entity is straightforward: if the measured value is equal to the special value, then the measured value was not available (e.g. the corresponding flag value may be equal to 1), otherwise it was available (e.g. the flag value may be equal to 0).
  • For category b) a measured value equal to the special value does not necessarily mean that the measured value was not available at the timestamp. At these timestamps, the flag value can still be either 0 or 1, while at timestamps where the measured value is not equal to the special value the profile vector element is surely 0.
  • To be able to assign flags to the measured values, it is necessary to infer from the input measured data which of the two categories a time series of measured values belongs to.
  • FIG. 6 shows a flowchart of a method for assigning flags to measured values according to an example.
  • The steps of the method may be implemented by an apparatus according to any example described herein. While the steps are described in a sequential manner, the person skilled in the art will appreciate that some steps may be omitted, combined, performed in different order and / or in parallel.
  • The same threshold L applied to lengths of same-values sequences is used as for the special value detection step (see FIG. 5 and the corresponding description).
  • The method evaluates a heuristic condition (see step 610) for inferring which of the two categories described above the time series of measured values belongs to. The heuristic condition is based on the observation that, in case of an unsupported measurement that falls in category a) (for which the special value is not a normal measured value), there are two possible cases when a same-value sequence with the special value is shorter than the L parameter (which has been used for the special value detection step):
    • case #1: the unsupported measurement period with a configuration state leading to unsupported measurements is at least L, but this unsupported measurement period may just partly fall into the whole measurement time period of the input data: either because the unsupported measurement period has started before the start of the whole measurement period, and only the second part of the measured values is contained in the whole measurement period; or the unsupported measurement period has started before the end of the whole measurement period and only the first part is contained in the whole measurement period.
    • case #2: the unsupported measurement period is contained entirely in the whole measurement period, but its length is shorter than the L parameter. This can happen, because the threshold L cannot be selected perfectly. However, with a reasonably good selection one can assume that only a small fraction of unsupported measurement periods are shorter than L.
  • In step 600, the values of the variables C0 and CS are determined.
  • C0 is the total count of same-value sequences with the special value v0 over all time series of measured values obtained for all network entities during a time period.
  • CS counts the same-value sequences with the special value v0, over all time series of measured values obtained for all network entities, that are shorter than L and correspond to an unsupported measurement period that is contained entirely in the whole measurement period. These same-value sequences correspond to unsupported measurement period(s) that is (are) contained entirely in the whole measurement period, but has (have) a length shorter than the L parameter, as defined in case #2 above.
  • The same-value sequences with the special value v0, if any, that start right at the first measurement time step may then be excluded for the determination of CS. Likewise, the same-value sequences with the special value v0, if any, that end right at the last measurement time step may be excluded for the determination of CS.
  • In step 610, the fraction CS/C0 of the same-value sequences with the special value v0 which are shorter than L over the total count is compared with a threshold Pth. Pth is a relative upper threshold for this fraction of the same-value sequences with the special value v0 which are shorter than L. If CS/C0 < Pth then step 620 is performed after step 610. If CS/C0 > Pth step 630 is performed after step 610. In case CS/C0 = Pth step 620 or 630 may be performed after step 610.
  • In the approach used in step 610, it is assumed that for an unsupported measurement in category a) and a reasonably good selection of L, the relative fraction CS/C0 is low. Hence, it is determined that if CS/C0 < Pth, then the unsupported measurement belongs to category a) (see step 620), and otherwise the unsupported measurement belongs to category b) (see steps 630-650). Pth should be set to a sufficiently small value, for example between 0 and 0.1 or between 0 and 0.01.
  • In step 620, the flags are assigned to the measured values in the time series obtained for the network entities. Here the assignment of flags per timestamp per network entity is straightforward as described earlier: a first flag value (e.g. f=1) is assigned to a measured value m if m=v0. Otherwise a second flag value (e.g. f=0) is assigned to the measured value m.
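  • Steps 600 to 620 can be sketched as follows. This is only an illustration under stated assumptions: the function names count_sequences and flags_category_a are hypothetical, each time series is assumed to be a Python list, min_len stands for L, and same-value sequences touching the first or last time step are excluded from CS as described above.

```python
from itertools import groupby


def count_sequences(series_per_entity, v0, min_len):
    """Return (C0, CS) for the special value v0 (step 600)."""
    c0 = cs = 0
    for series in series_per_entity.values():
        pos = 0
        for value, group in groupby(series):
            length = sum(1 for _ in group)
            start, end = pos, pos + length   # the run covers series[start:end]
            pos = end
            if value != v0:
                continue
            c0 += 1
            # Runs cut by the edges of the measurement period are not counted in CS.
            touches_edge = start == 0 or end == len(series)
            if length < min_len and not touches_edge:
                cs += 1
    return c0, cs


def flags_category_a(series, v0):
    """Step 620: flag 1 where the value equals v0, flag 0 otherwise."""
    return [1 if m == v0 else 0 for m in series]


# Invented sample data with -1 as the special value.
sample = {
    "a": [3, -1, -1, 5, 4] + [-1] * 6 + [7],
    "b": [-1] * 3 + [2, 4] + [-1] * 7,
}
c0, cs = count_sequences(sample, v0=-1, min_len=5)
is_category_a = cs / c0 < 0.1   # step 610 with an assumed Pth of 0.1
```

  • For this invented sample the fraction CS/C0 is 0.25, which is above the assumed Pth, so the statistical approach of steps 630-650 would be used instead of flags_category_a.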
  • In steps 630-650, a statistical approach is used based on the statistical distribution of the lengths of same-value sequences with the special value.
  • The length of a same-value sequence with the special value may be used to decide whether the same-value sequence is a sequence (referred to as a "not normal" sequence) for which the measurement was not available, or a sequence (referred to as a "normal" sequence) when the special value was the result of normal measurements taking the special value.
  • In step 630, a statistical distribution of the lengths of same-values sequences of the special value v0 that are shorter than L (the same-values sequences taken into account for the computation of CS) is generated.
  • As a heuristic, the statistical distribution may be determined on the same-value sequences whose length is shorter than L, assuming that most of these sequences are "normal" sequences.
  • In step 640, the statistical distribution is used to determine whether the length of a given same-value sequence with the special value v0 is an outlier in the statistical distribution. Here "outlier" means that the length is outstandingly long: the outlier may thus be seen as a high outlier.
  • The method may be configured to estimate the statistical distribution of the lengths of the "normal" sequences and then detect "not normal" sequences whose length is an outlier according to this statistical distribution.
  • The detection of the "normal" sequence length statistical distribution and/or the outlier detection may be performed using various algorithms, e.g. a classification algorithm. It can be based on the basic parameters of the statistical distribution (e.g. mean and/or standard deviation) and then using these parameters to detect the outlier values in the statistical distribution, for example by detecting the length values that exceed a threshold computed based on the mean and standard deviation. It can be based on machine learning algorithms. It can be based for example on a classification method like the random forest.
  • In step 650, the flags are assigned to the measured values in the time series obtained for the network entities. A first flag value (e.g. f=1) is assigned to a measured value m if m=v0 and the length of the same-value sequence including the timestamp t is an outlier in the statistical distribution. Otherwise a second flag value (e.g. f=0) is assigned to the measured value m.
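  • Steps 630 to 650 can be sketched with the mean and standard deviation option mentioned above. This is a minimal sketch under stated assumptions and not the claimed implementation: the function name flags_category_b and the multiplier k are hypothetical, the threshold mean + k * standard deviation is one possible choice among others, and for simplicity all sequences with the special value that are shorter than L (min_len) feed the distribution, including those touching the edges of the measurement period.

```python
from itertools import groupby
from statistics import mean, pstdev


def flags_category_b(series_per_entity, v0, min_len, k=3.0):
    """Flag values equal to v0 whose same-value sequence is unusually long."""
    runs = {
        entity: [(v, sum(1 for _ in g)) for v, g in groupby(series)]
        for entity, series in series_per_entity.items()
    }

    # Step 630: lengths of the v0 sequences shorter than L, assumed mostly "normal".
    short = [n for r in runs.values() for v, n in r if v == v0 and n < min_len]
    # Step 640: lengths above this threshold are treated as high outliers.
    threshold = mean(short) + k * pstdev(short) if short else 0.0

    # Step 650: one flag per measured value.
    flags = {}
    for entity, entity_runs in runs.items():
        out = []
        for v, n in entity_runs:
            flag = 1 if v == v0 and n > threshold else 0
            out.extend([flag] * n)
        flags[entity] = out
    return flags


# Invented sample: 0 is both a normal value and the fill-in value.
sample = {"a": [0, 5, 0, 0, 7, 0, 3] + [0] * 10 + [2]}
flags = flags_category_b(sample, v0=0, min_len=8)
```

  • In this invented sample the short sequences of zeros have lengths 1, 2 and 1, so the sequence of ten zeros is far above the threshold and only its values receive the first flag value.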
  • FIG. 7 shows a statistical distribution of the lengths of same-value sequences with the special numerical value v0 according to an example. The x axis is the sequence length in number of measurement timestamps. The y axis is the same-value sequence count.
  • The statistical distribution shows a first set 720 of "outlier" sequence length values between x=25 and x=80. The statistical distribution shows a second set 710 of "normal" sequence length values between x=1 and x=3. The length values in the set 720 are outliers in the statistical distribution.
  • A threshold L between the "normal" sequence length values and the "outlier" sequence length values may be set to L=24 or a lower L value (e.g. L>=4). In this example, one can detect the outliers by taking the sequence lengths which are below L=24: this does not introduce much error, because the number of "outlier" sequence length values below L is proportionally very small compared to the total number of sequence length values above L.
  • It should be appreciated by those skilled in the art that any functions, engines, block diagrams, flow diagrams, state transition diagrams, flowchart and / or data structures described herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes.
  • Although a flow chart may describe operations as a sequential process, many of the operations may be performed in parallel, concurrently or simultaneously. A process may be terminated when its operations are completed but may also have additional steps not disclosed in the figure or description. A process may correspond to a method, function, procedure, subroutine, subprogram, etc. When a process corresponds to a function, its termination may correspond to a return of the function to the calling function or the main function.
  • Each described function, engine, block, step described herein can be implemented in hardware, software, firmware, middleware, microcode, or any suitable combination thereof.
  • When implemented in software, firmware, middleware or microcode, instructions to perform the necessary tasks may be stored in a computer readable medium that may or may not be included in a host apparatus or host system. The instructions may be transmitted over the computer-readable medium and be loaded onto the host apparatus or host system. The instructions are configured to cause the host apparatus or host system to perform one or more functions disclosed herein. For example, as mentioned above, according to one or more examples, at least one memory may include or store instructions, the at least one memory and the instructions may be configured to, with at least one processor, cause the host apparatus or host system to perform the one or more functions. Additionally, the processor, memory and instructions serve as means for providing or causing performance by the host apparatus or host system of one or more functions disclosed herein.
  • The host apparatus or host system may be a general-purpose computer and / or computing system, a special purpose computer and / or computing system, a programmable processing apparatus and / or system, a machine, etc. The host apparatus or host system may be or include or be part of: a user equipment, client device, mobile phone, laptop, computer, network element, data server, network resource controller, network apparatus, router, gateway, network node, computer, cloud-based server, web server, application server, proxy server, etc.
  • FIG. 8 illustrates an example embodiment of an apparatus 9000. The apparatus may be configured to host at least one data consumer entity as disclosed herein. The apparatus may be configured to perform one or several of the methods disclosed herein.
  • As represented schematically, the apparatus 9000 may include at least one processor 9010 and at least one memory 9020. The apparatus 9000 may include one or more communication interfaces 9040 (e.g. network interfaces for access to a wired / wireless network, including Ethernet interface, Wi-Fi interface, etc.) connected to the processor and configured to communicate via wired / wireless communication link(s). The apparatus 9000 may include user interfaces 9030 (e.g. keyboard, mouse, display screen, etc.) connected with the processor. The apparatus 9000 may further include one or more media drives 9050 for reading a computer-readable storage medium (e.g. digital storage disc 9060 (CD-ROM, DVD, Blu-ray, etc.), USB key 9080, etc.). The processor 9010 is connected to each of the other components 9020, 9030, 9040, 9050 in order to control operation thereof.
  • The memory 9020 may include a random access memory (RAM), cache memory, non-volatile memory, backup memory (e.g., programmable or flash memories), read-only memory (ROM), a hard disk drive (HDD), a solid state drive (SSD) or any combination thereof. The ROM of the memory 9020 may be configured to store, amongst other things, an operating system of the apparatus 9000 and / or one or more computer program code of one or more software applications. The RAM of the memory 9020 may be used by the processor 9010 for the temporary storage of data.
  • The processor 9010 may be configured to store, read, load, execute and/or otherwise process instructions 9070 stored in a computer-readable storage medium 9060, 9080 and / or in the memory 9020 such that, when the instructions are executed by the processor, the apparatus 9000 is caused to perform one or more or all steps of a method described herein for the concerned apparatus 9000.
  • The instructions may correspond to program instructions or computer program code. The instructions may include one or more code segments. A code segment may represent a procedure, function, subprogram, program, routine, subroutine, module, software package, class, or any combination of instructions, data structures or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable technique including memory sharing, message passing, token passing, network transmission, etc.
  • When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. The term "processor" should not be construed to refer exclusively to hardware capable of executing software and may implicitly include one or more processing circuits, whether programmable or not. A processor or likewise a processing circuit may correspond to a digital signal processor (DSP), a network processor, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a System-on-Chip (SoC), a Central Processing Unit (CPU), an arithmetic logic unit (ALU), a programmable logic unit (PLU), a processing core, a programmable logic, a microprocessor, a controller, a microcontroller, a microcomputer, a quantum processor, or any device capable of responding to and/or executing instructions in a defined manner and/or according to a defined logic. Other hardware, conventional or custom, may also be included. A processor or processing circuit may be configured to execute instructions adapted for causing the host apparatus or host system to perform one or more functions disclosed herein for the host apparatus or host system.
  • A computer readable medium or computer readable storage medium may be any tangible storage medium suitable for storing instructions readable by a computer or a processor. A computer readable medium may be more generally any storage medium capable of storing and/or containing and/or carrying instructions and/or data. The computer readable medium may be a non-transitory computer readable medium. The term "non-transitory", as used herein, is a limitation of the medium itself (i.e., tangible, not a signal) as opposed to a limitation on data storage persistency (e.g., RAM vs. ROM).
  • A computer-readable medium may be a portable or fixed storage medium. A computer readable medium may include one or more storage devices such as a permanent mass storage device, a magnetic storage medium, an optical storage medium, a digital storage disc (CD-ROM, DVD, Blu-ray, etc.), a USB key or dongle or peripheral, or a memory suitable for storing instructions readable by a computer or a processor.
  • A memory suitable for storing instructions readable by a computer or a processor may be for example: read only memory (ROM), a permanent mass storage device such as a disk drive, a hard disk drive (HDD), a solid state drive (SSD), a memory card, a core memory, a flash memory, or any combination thereof.
  • In the present description, the wording "means configured to perform one or more functions" or "means for performing one or more functions" may correspond to one or more functional blocks comprising circuitry that is adapted for performing or configured to perform the concerned function(s). The block may itself perform this function or may cooperate and / or communicate with one or more other blocks to perform this function. The "means" may correspond to or be implemented as "one or more modules", "one or more devices", "one or more units", etc. The means may include at least one processor and at least one memory including computer program code, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause an apparatus or system to perform the concerned function(s).
  • As used in this application, the term "circuitry" may refer to one or more or all of the following:
    (a) hardware-only circuit implementations (such as implementations in only analog and/or digital circuitry); and
    (b) combinations of hardware circuits and software, such as (as applicable): (i) a combination of analog and/or digital hardware circuit(s) with software/firmware and (ii) any portions of hardware processor(s) with software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions; and
    (c) hardware circuit(s) and/or processor(s), such as a microprocessor(s) or a portion of a microprocessor(s), that requires software (e.g., firmware) for operation, but the software may not be present when it is not needed for operation.
  • This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, an integrated circuit for a network element or network node or any other computing device or network device.
  • The term circuitry may cover digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), etc. The circuitry may be or include, for example, hardware, programmable logic, a programmable processor that executes software or firmware, and/or any combination thereof (e.g. a processor, control unit/entity, controller) to execute instructions or software and control transmission and reception of signals, and a memory to store data and/or instructions.
  • The circuitry may also make decisions or determinations, generate frames, packets or messages for transmission, decode received frames or messages for further processing, and perform other tasks or functions described herein. The circuitry may control transmission of signals or messages over a radio network, and may control the reception of signals or messages, etc., via one or more communication networks.
  • Although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and similarly, a second element could be termed a first element, without departing from the scope of this disclosure. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
  • The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used herein, the singular forms "a," "an," and "the," are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes," and/or "including," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
  • LIST OF MAIN ABBREVIATIONS

    AD: Anomaly Detection
    API: Application Programming Interface
    CA: Carrier Aggregation
    ML: Machine Learning
    PM: Performance Management
    RAN: Radio Access Network

    Claims (14)

    1. A computer-implemented method comprising:
      obtaining (300), for each of one or more network entities, a time series of values of a measurement parameter for respective timestamps, the time series of values including measured values and a special numerical value at one or more timestamps, wherein the time series includes the special numerical value at a given timestamp for replacing a value of the measurement parameter when no measured value is available for the measurement parameter at the given timestamp;
      parsing (310) the time series of values to determine which numerical value in the time series of values corresponds to the special numerical value,
      wherein the parsing includes detecting same-values sequences having a minimum length in the time series of values and generating a set of at least one value including the value of each of the detected same-values sequences having the minimum length, wherein if the set of values includes only one value, the sole value in the set of values is identified as being the special numerical value;
      wherein the parsing includes computing a count of value changes occurring in a sliding time window of a given length applied to the time series of values to detect at least one portion of the time series in which a measured value is available for the measurement parameter;
      assigning (320) flags to the values in the time series of values based on the result of the parsing, wherein a flag assigned to a value obtained at a given timestamp indicates whether or not the measurement was available at the given timestamp in the time series of values.
    2. The method according to claim 1, wherein one or more time series of values are obtained respectively for the one or more network entities, wherein the parsing includes determining if the ratio of the number of values in the one or more time series that are equal to the sole value identified as the special numerical value over the number of measured values in the one or more time series is below a threshold.
    3. The method of any of the preceding claims, wherein a portion of the time series in which a measured value is available for the measurement parameter is detected if a count of value changes occurring in the sliding time window is above a threshold for at least one temporal position of the sliding time window.
    4. The method of any of the preceding claims, comprising:
      performing data analytics on the time series of values based on the assigned flags to generate data analytics results.
    5. The method of claim 4, comprising:
      performing an operation on one or more network devices or network functions based on the data analytics results.
    6. The method of any of the preceding claims, comprising:
      determining whether the special numerical value is a value out of a normal range of values in which the measured values fall or in the range of values,
      wherein the determining is based on a comparison between a first count of same-values sequences with the special numerical value in time series of values obtained for the one or more network entities and a second count of same-values sequences with the special numerical value in time series of values obtained for one or more network entities that are shorter than a threshold.
    7. The method of claim 6, wherein:
      when the ratio between the second count and the first count is below a threshold, it is determined that the special numerical value is a value out of the normal range of values and wherein a flag assigned to a value is equal to a first flag value for each value in the time series of values that is equal to the special numerical value and a second flag value otherwise.
    8. The method of claim 7, wherein:
      when the ratio between the second count and the first count is above the threshold, it is determined that the special numerical value is a value in the normal range of values;
      wherein the method comprises:
      using a statistical distribution of the lengths of same-values sequences of the special numerical value to detect whether the length of a given same-values sequence with the special value is an outlier in the statistical distribution;
      wherein a flag corresponding to a given timestamp takes a first flag value for each value in the time series of values that is equal to the special numerical value when the length of the same-values sequence including the concerned special numerical value is an outlier in the statistical distribution and a second flag value otherwise.
    9. The method of claim 8, wherein the statistical distribution is determined for the lengths of same-values sequences of the special numerical value that are shorter than the minimum length.
    10. The method of claim 8 or 9, wherein analyzing the statistical distribution is performed using a classification algorithm to detect a presence or absence of at least one length that is an outlier in the statistical distribution.
    11. An apparatus comprising means for performing all the steps of a method according to any of the preceding claims.
    12. An apparatus according to claim 11, wherein the means comprise
      - at least one processor;
      - at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus to perform the method.
    13. A computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the steps:
      obtaining (300), for each of one or more network entities, a time series of values of a measurement parameter for respective timestamps, the time series of values including measured values and a special numerical value at one or more timestamps, wherein the time series includes the special numerical value at a given timestamp for replacing a value of the measurement parameter when no measured value is available for the measurement parameter at the given timestamp;
      parsing (310) the time series of values to determine which numerical value in the time series of values corresponds to the special numerical value,
      wherein the parsing includes detecting same-values sequences having a minimum length in the time series of values and generating a set of at least one value including the value of each of the detected same-values sequences having the minimum length, wherein if the set of values includes only one value, the sole value in the set of values is identified as being the special numerical value;
      wherein the parsing includes computing a count of value changes occurring in a sliding time window of a given length applied to the time series of values to detect at least one portion of the time series in which a measured value is available for the measurement parameter;
      assigning (320) flags to the values in the time series of values based on the result of the parsing, wherein a flag assigned to a value obtained at a given timestamp indicates whether or not the measurement was available at the given timestamp in the time series of values.
    14. A non-transitory computer-readable medium comprising program instructions stored thereon which, when the program is executed by a computer, cause the computer to carry out the steps:
      obtaining (300), for each of one or more network entities, a time series of values of a measurement parameter for respective timestamps, the time series of values including measured values and a special numerical value at one or more timestamps, wherein the time series includes the special numerical value at a given timestamp for replacing a value of the measurement parameter when no measured value is available for the measurement parameter at the given timestamp;
      parsing (310) the time series of values to determine which numerical value in the time series of values corresponds to the special numerical value,
      wherein the parsing includes detecting same-values sequences having a minimum length in the time series of values and generating a set of at least one value including the value of each of the detected same-values sequences having the minimum length, wherein if the set of values includes only one value, the sole value in the set of values is identified as being the special numerical value;
      wherein the parsing includes computing a count of value changes occurring in a sliding time window of a given length applied to the time series of values to detect at least one portion of the time series in which a measured value is available for the measurement parameter;
      assigning (320) flags to the values in the time series of values based on the result of the parsing, wherein a flag assigned to a value obtained at a given timestamp indicates whether or not the measurement was available at the given timestamp in the time series of values.
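
By way of non-limiting illustration only, the parsing (310) and the flag assignment (320) recited in claims 1 and 3 may be sketched in Python as follows. The function names, the example series, the window length, the thresholds and the use of -1 as the special numerical value are assumptions introduced for this sketch and are not taken from the description. The sketch further assumes uniformly spaced timestamps, so that the sliding time window can be expressed as a number of samples, and it shows only the simple flag rule in which the special numerical value lies outside the normal range of the measured values.

```python
from itertools import groupby


def same_value_runs(values):
    """Return (value, start_index, length) for each run of identical consecutive values."""
    runs = []
    index = 0
    for value, group in groupby(values):
        length = len(list(group))
        runs.append((value, index, length))
        index += length
    return runs


def find_special_value(values, min_length):
    """Collect the values of all runs of at least min_length samples.

    If exactly one candidate value is found, it is taken as the special
    numerical value; otherwise None is returned (no decision at this step).
    """
    candidates = {v for v, _, length in same_value_runs(values) if length >= min_length}
    return next(iter(candidates)) if len(candidates) == 1 else None


def change_counts(values, window):
    """Count value changes between neighbouring samples for each window position."""
    changes = [int(a != b) for a, b in zip(values, values[1:])]
    span = window - 1  # a window of `window` samples holds window - 1 neighbour pairs
    return [sum(changes[i:i + span]) for i in range(len(values) - window + 1)]


def has_measured_portion(values, window, change_threshold):
    """True if at least one window position shows more changes than the threshold."""
    return any(count > change_threshold for count in change_counts(values, window))


def assign_flags(values, special_value):
    """True = measured value available, False = special value (assumed rule)."""
    return [value != special_value for value in values]


# Hypothetical series: four consecutive samples without a measurement, filled with -1.
series = [12, 15, 14, -1, -1, -1, -1, 13, 16, 15, 17]
special = find_special_value(series, min_length=3)  # -1
counts = change_counts(series, window=4)            # [3, 2, 1, 0, 1, 2, 3, 3]
flags = assign_flags(series, special)
```

In this sketch the window positions with a high change count correspond to portions of the series in which measured values are present, while the position with a count of zero lies entirely inside the run of the special numerical value.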
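
As a further non-limiting illustration, the determination recited in claims 6 to 9 may be sketched in Python as follows. All function names, parameter names, thresholds and example values are assumptions made for this sketch. In particular, claim 10 refers to a classification algorithm without naming one; the sketch substitutes a plain interquartile-range fence as a stand-in outlier test, and it falls back to the minimum length when fewer than two short sequences are available.

```python
from itertools import groupby
from statistics import quantiles


def special_run_lengths(values, special_value):
    """Lengths of all same-values sequences made of the special numerical value."""
    return [len(list(g)) for v, g in groupby(values) if v == special_value]


def is_out_of_range(series_list, special_value, short_length, ratio_threshold):
    """Compare the count of short special-value runs with the count of all such runs.

    A low ratio suggests that the special value does not occur among normal
    measurements, i.e. that it lies outside the normal range of values.
    """
    lengths = [n for s in series_list for n in special_run_lengths(s, special_value)]
    if not lengths:
        raise ValueError("special value does not occur in the series")
    short = sum(1 for n in lengths if n < short_length)
    return short / len(lengths) < ratio_threshold


def outlier_fence(lengths, min_length):
    """Upper IQR fence computed on run lengths shorter than min_length (assumed rule)."""
    reference = [n for n in lengths if n < min_length]
    if len(reference) < 2:
        return float(min_length - 1)  # assumed fallback when data is too sparse
    q1, _, q3 = quantiles(reference, n=4)
    return q3 + 1.5 * (q3 - q1)


def assign_flags_in_range(values, special_value, min_length):
    """Flag a special-value sample as unavailable only if its run length is an outlier."""
    fence = outlier_fence(special_run_lengths(values, special_value), min_length)
    flags = []
    for value, group in groupby(values):
        n = len(list(group))
        unavailable = value == special_value and n > fence
        flags.extend([not unavailable] * n)  # True = measured value available
    return flags


# Hypothetical counter for which 0 is both a legitimate reading and the fill value.
values = [5, 0, 3, 0, 0, 4, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 2]
in_range = not is_out_of_range([values], 0, short_length=5, ratio_threshold=0.5)
flags = assign_flags_in_range(values, special_value=0, min_length=5)
```

With these example values, the isolated zeros and the pair of zeros are kept as measured values, whereas the run of eight zeros is flagged as a period in which no measurement was available.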
    EP22206704.3A 2022-11-10 2022-11-10 Data analytics on measurement data Active EP4369679B1 (en)

    Priority Applications (2)

    Application Number Priority Date Filing Date Title
    EP22206704.3A EP4369679B1 (en) 2022-11-10 2022-11-10 Data analytics on measurement data
    US18/483,213 US12335127B2 (en) 2022-11-10 2023-10-09 Data analytics on measurement data

    Applications Claiming Priority (1)

    Application Number Priority Date Filing Date Title
    EP22206704.3A EP4369679B1 (en) 2022-11-10 2022-11-10 Data analytics on measurement data

    Publications (2)

    Publication Number Publication Date
    EP4369679A1 (en) 2024-05-15
    EP4369679B1 (en) 2025-05-21

    Family

    ID=84358539

    Family Applications (1)

    Application Number Title Priority Date Filing Date
    EP22206704.3A Active EP4369679B1 (en) 2022-11-10 2022-11-10 Data analytics on measurement data

    Country Status (2)

    Country Link
    US (1) US12335127B2 (en)
    EP (1) EP4369679B1 (en)

    Family Cites Families (23)

    * Cited by examiner, † Cited by third party
    Publication number Priority date Publication date Assignee Title
    US8472328B2 (en) * 2008-07-31 2013-06-25 Riverbed Technology, Inc. Impact scoring and reducing false positives
    WO2014003508A1 (en) 2012-06-29 2014-01-03 엘지전자 주식회사 Method for measuring and reporting csi-rs in wireless communication system, and apparatus for supporting same
    JP5703407B1 (en) * 2014-03-28 2015-04-22 株式会社日立ハイテクノロジーズ Information processing apparatus, information processing method, information system, and program
    US11783046B2 (en) * 2017-04-26 2023-10-10 Elasticsearch B.V. Anomaly and causation detection in computing environments
    US20190102361A1 (en) * 2017-09-29 2019-04-04 Linkedin Corporation Automatically detecting and managing anomalies in statistical models
    DE102017219209B4 (en) * 2017-10-26 2024-08-29 Continental Automotive Technologies GmbH Method for detecting an incorrect time stamp of an Ethernet message and control unit for a motor vehicle
    US20210041525A1 (en) 2018-02-16 2021-02-11 Telefonaktiebolaget Lm Ericsson (Publ) Similarity Metric Customized to Radio Measurement in Heterogeneous Wireless Networks and Use Thereof
    US11860971B2 (en) * 2018-05-24 2024-01-02 International Business Machines Corporation Anomaly detection
    US20210089927A9 (en) * 2018-06-12 2021-03-25 Ciena Corporation Unsupervised outlier detection in time-series data
    US11815882B2 (en) * 2018-09-11 2023-11-14 Throughput, Inc. Industrial bottleneck detection and management method and system
    US11138090B2 (en) * 2018-10-23 2021-10-05 Oracle International Corporation Systems and methods for forecasting time series with variable seasonality
    US10896574B2 (en) * 2018-12-31 2021-01-19 Playtika Ltd System and method for outlier detection in gaming
    US11514004B2 (en) * 2019-03-07 2022-11-29 Salesforce.Com, Inc. Providing a simple and flexible data access layer
    CN114269521B (en) * 2019-08-09 2024-08-02 索尼集团公司 Information processing device, information processing method, program, and robot
    WO2021077293A1 (en) 2019-10-22 2021-04-29 华为技术有限公司 Communication method and apparatus, and device and system
    KR102698188B1 (en) * 2020-02-27 2024-08-22 후아웨이 테크놀러지 컴퍼니 리미티드 Generating and consuming analytics on mobile networks
    US11567926B2 (en) * 2020-03-17 2023-01-31 Noodle Analytics, Inc. Spurious outlier detection system and method
    US20230217280A1 (en) 2020-07-10 2023-07-06 Telefonaktiebolaget Lm Ericsson (Publ) Measurement Triggering Based on Data Traffic
    CA3190677A1 (en) 2020-08-07 2022-02-10 Zte Corporation Method for enhancing wireless communication device measurements
    GB2599698B (en) * 2020-10-09 2022-12-21 Neuville Grid Data Man Limited High-resolution electrical measurement data processing
    JP7672926B2 (en) * 2021-09-02 2025-05-08 日立ヴァンタラ株式会社 Outlier detection device and method
    US11960254B1 (en) * 2022-03-11 2024-04-16 Bentley Systems, Incorporated Anomaly detection and evaluation for smart water system management
    US12013747B2 (en) * 2022-08-10 2024-06-18 International Business Machines Corporation Dynamic window-size selection for anomaly detection

    Also Published As

    Publication number Publication date
    US12335127B2 (en) 2025-06-17
    US20240163195A1 (en) 2024-05-16
    EP4369679A1 (en) 2024-05-15

    Legal Events

    Date Code Title Description
    PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase; Free format text: ORIGINAL CODE: 0009012
    STAA Information on the status of an ep patent application or granted ep patent; Free format text: STATUS: THE APPLICATION HAS BEEN PUBLISHED
    AK Designated contracting states; Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR
    STAA Information on the status of an ep patent application or granted ep patent; Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE
    17P Request for examination filed; Effective date: 20240621
    RBV Designated contracting states (corrected); Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR
    GRAP Despatch of communication of intention to grant a patent; Free format text: ORIGINAL CODE: EPIDOSNIGR1
    STAA Information on the status of an ep patent application or granted ep patent; Free format text: STATUS: GRANT OF PATENT IS INTENDED
    INTG Intention to grant announced; Effective date: 20250109
    GRAS Grant fee paid; Free format text: ORIGINAL CODE: EPIDOSNIGR3
    GRAJ Information related to disapproval of communication of intention to grant by the applicant or resumption of examination proceedings by the epo deleted; Free format text: ORIGINAL CODE: EPIDOSDIGR1
    GRAL Information related to payment of fee for publishing/printing deleted; Free format text: ORIGINAL CODE: EPIDOSDIGR3
    STAA Information on the status of an ep patent application or granted ep patent; Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE
    INTC Intention to grant announced (deleted)
    GRAP Despatch of communication of intention to grant a patent; Free format text: ORIGINAL CODE: EPIDOSNIGR1
    STAA Information on the status of an ep patent application or granted ep patent; Free format text: STATUS: GRANT OF PATENT IS INTENDED
    GRAA (expected) grant; Free format text: ORIGINAL CODE: 0009210
    STAA Information on the status of an ep patent application or granted ep patent; Free format text: STATUS: THE PATENT HAS BEEN GRANTED
    INTG Intention to grant announced; Effective date: 20250407
    AK Designated contracting states; Kind code of ref document: B1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR
    REG Reference to a national code; Ref country code: GB; Ref legal event code: FG4D
    REG Reference to a national code; Ref country code: CH; Ref legal event code: EP
    REG Reference to a national code; Ref country code: DE; Ref legal event code: R096; Ref document number: 602022014917; Country of ref document: DE
    REG Reference to a national code; Ref country code: IE; Ref legal event code: FG4D
    REG Reference to a national code; Ref country code: NL; Ref legal event code: MP; Effective date: 20250521
    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
      Ref country code: FI; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20250521
      Ref country code: PT; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20250922
      Ref country code: ES; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20250521
    REG Reference to a national code; Ref country code: LT; Ref legal event code: MG9D