Skyline Processing
Assume records with d attributes, each taking values from a
totally ordered domain. Accordingly, a record can be represented as a point in
the d-dimensional space (in the sequel, we use the terms record, point, and
object interchangeably). The skyline contains the best points
according to any function that is monotonic on each attribute. Conversely, for
each skyline record r, there is such a function that would assign it the
highest score. These attractive properties of skylines have led to their
application in various domains, including multi-objective optimization, maximum
vectors, and the contour problem. The skyline operator was later introduced to the
database literature, together with two disk-based algorithms for large data sets.
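
To make the definition concrete, here is a minimal sketch in Python. It assumes the usual dominance relation (a point dominates another if it is at least as good on every attribute and strictly better on at least one) and that larger values are better; the text fixes neither convention explicitly. The names dominates and naive_skyline, the sample points, and the weights of the scoring function are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q (assumption: larger is better)."""
    at_least_as_good = all(a >= b for a, b in zip(p, q))
    strictly_better = any(a > b for a, b in zip(p, q))
    return at_least_as_good and strictly_better


def naive_skyline(points: Sequence[Point]) -> List[Point]:
    """Quadratic-time skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical 2-dimensional records.
points = [(1, 9), (3, 7), (2, 6), (5, 5), (4, 2), (8, 1)]
skyline = naive_skyline(points)

# A scoring function that is monotonic on each attribute (hypothetical weights)
# attains its maximum at a point that belongs to the skyline.
best = max(points, key=lambda p: 0.3 * p[0] + 0.7 * p[1])
print(skyline, best)
```

This brute-force version compares every pair of points, so it is useful only as a reference for checking faster methods.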
The first, called D&C (for divide and conquer), divides the data set into
partitions that fit in memory, computes the partial skyline in every partition,
and generates the final skyline by merging the partial ones. The second
algorithm, called BNL, applies the concept of block-nested loops. A later
variant improves BNL by sorting the data. These methods and their variants do not use any
indexing and, usually, they have to scan the entire data set before reporting
any skyline point. Another set of algorithms utilizes conventional or
multidimensional indexes to speed up query processing and progressively report
skyline points. Such methods include Bitmap, among others. In addition to conventional
databases, skyline processing has been studied in other scenarios. For
instance, Morse et al. use spatial access methods to maintain the skyline in
streams with explicit deletions. Efficient skyline maintenance has also been
the focus of other work. In distributed environments, several methods query independent
subsystems, each in charge of a specific attribute, and compute the skylines
using the partial results. In the data mining context, identify the
combinations
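
The index-free approaches described above (D&C, BNL, and the sort-first improvement of BNL) can be sketched in memory as follows. This is a simplified illustration under stated assumptions, not the disk-based algorithms themselves: everything is assumed to fit in memory, larger values are assumed to be better, the merge step of D&C is reduced to recomputing the skyline over the union of the partial skylines, and the sort key (the sum of the coordinates) is one possible choice of ordering rather than the one prescribed by any specific paper. All function names and the partition_size parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q (assumption: larger is better)."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def bnl_skyline(points: Sequence[Point]) -> List[Point]:
    """BNL-style pass: keep a window of mutually incomparable candidates."""
    window: List[Point] = []
    for p in points:
        if any(dominates(w, p) for w in window):
            continue  # p is dominated by a candidate, discard it
        # p survives: evict the candidates it dominates, then add it
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


def sorted_skyline(points: Sequence[Point]) -> List[Point]:
    """Sort-first variant: after sorting by descending coordinate sum, a point
    can only be dominated by points that precede it, so no eviction is needed."""
    window: List[Point] = []
    for p in sorted(points, key=sum, reverse=True):
        if not any(dominates(w, p) for w in window):
            window.append(p)
    return window


def dc_skyline(points: Sequence[Point], partition_size: int = 3) -> List[Point]:
    """D&C-style: compute a partial skyline per partition, then merge them."""
    partials: List[Point] = []
    for start in range(0, len(points), partition_size):
        partials.extend(bnl_skyline(points[start:start + partition_size]))
    # Simplified merge: the skyline of the union of the partial skylines
    return bnl_skyline(partials)


# Hypothetical 2-dimensional records.
points = [(1, 9), (3, 7), (2, 6), (5, 5), (4, 2), (8, 1)]
print(bnl_skyline(points))
print(sorted_skyline(points))
print(dc_skyline(points))
```

In the sort-first variant every point admitted to the window is already known to be a skyline point, so results could be reported progressively, which is one motivation for sorting the data before the scan.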