Decomposes a data frame into several relations, based on the given database schema. The data frame is expected to satisfy all the functional dependencies implied by the schema, as it does when the schema was constructed from the same data frame. If it does not, the function returns an error.
Usage
decompose(
df,
schema,
keep_rownames = FALSE,
digits = getOption("digits"),
check = TRUE
)
Arguments
- df
a data.frame, containing the data to be normalised.
- schema
a database schema with foreign key references, such as given by autoref.
- keep_rownames
a logical or a string, indicating whether to include the row names as a column. If a string is given, it is used as the name for the column, otherwise the column is named "row". Set to FALSE by default.
- digits
a positive integer, indicating how many significant digits are to be used for numeric and complex variables. A value of NA results in no rounding. By default, this uses getOption("digits"), similarly to format. See the "Floating-point variables" section for discover for why this rounding is necessary for consistent results across different machines. See the note in print.default about digits >= 16.
- check
a logical, indicating whether to check that df satisfies the functional dependencies enforced by schema before creating the result. This can find key violations without spending time creating the result first, but is redundant if df was used to create schema in the first place.
Value
A database object, containing the data in df within the database schema given in schema.
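As a minimal sketch of how the arguments and return value fit together, the following builds a schema from a data frame and then decomposes that same data frame with it. It assumes a recent autodb release in which discover accepts an accuracy argument and synthesise returns a relation schema that autoref can add references to; the choice of the built-in ChickWeight data set is only for illustration.

```r
library(autodb)

# Find the exact functional dependencies in the data
# (accuracy = 1 is assumed here to mean "no approximate dependencies").
deps <- discover(ChickWeight, accuracy = 1)

# Turn the dependencies into relation schemas, then add foreign key
# references to get a database schema suitable for decompose().
schema <- autoref(synthesise(deps))

# Decompose the same data frame that the schema was derived from.
# check = FALSE skips the dependency check, which is redundant in this case.
db <- decompose(ChickWeight, schema, check = FALSE)

print(db)
```

Leaving check at its default of TRUE should give the same database here, at the cost of validating the data against the schema first.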
Details
If the schema was constructed using approximate dependencies for the same
data frame, decompose
returns an error, to prevent either duplicate records
or lossy decompositions. This is temporary: for the next update, we plan to
add an option to allow this, or to add "approximate" equivalents of databases
and database schemas.
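
The error behaviour can be illustrated with a simpler case than approximate dependencies: decomposing a data frame that violates a dependency in a schema built from a different data frame. This is a sketch under assumptions, not part of the package's own examples; the data frames df_good and df_bad are hypothetical, and the exact error message is not shown because it may differ between versions.

```r
library(autodb)

# Hypothetical data: id determines label, and value is unique.
df_good <- data.frame(
  id = c(1, 1, 2, 2),
  label = c("a", "a", "b", "b"),
  value = c(10, 20, 30, 40)
)

# Same columns, but id = 1 now maps to two labels,
# so the dependency id -> label no longer holds.
df_bad <- df_good
df_bad$label[2] <- "c"

# Build the schema from the well-behaved data frame only.
schema <- autoref(synthesise(discover(df_good, accuracy = 1)))

# Decomposing the violating data frame is expected to signal an error;
# tryCatch captures the message instead of stopping the script.
result <- tryCatch(
  decompose(df_bad, schema),
  error = function(e) conditionMessage(e)
)
print(result)
```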