Terms 2.0: son of Terms #71

Merged on Mar 10, 2019 (193 commits)
bcaa1cf
notes on formula2.0
kleinschmidt Jul 31, 2018
d80d437
is_special guard
kleinschmidt Aug 1, 2018
d11e2dd
document is_special
kleinschmidt Aug 1, 2018
9676c0c
WIP Term types
kleinschmidt Aug 1, 2018
bedcb89
operators on AbstractTerms
kleinschmidt Aug 1, 2018
2fcf9dd
overwrite expression with FormulaTerm constructor
kleinschmidt Aug 2, 2018
2a6978c
create Terms as last stage of formula macro
kleinschmidt Aug 2, 2018
f821c8a
schema for terms
kleinschmidt Aug 3, 2018
378e35a
handle hints
kleinschmidt Aug 3, 2018
5486213
show methods
kleinschmidt Aug 3, 2018
f83d977
fallback apply_schema for schema dict
kleinschmidt Aug 3, 2018
e9b15dc
FullRank schema; move schemas to schema.jl
kleinschmidt Aug 4, 2018
5bb53bf
note on anon functions
kleinschmidt Aug 4, 2018
d3840ff
WIP alternative FunctionTerm representation
kleinschmidt Aug 4, 2018
2b7422c
more debugging messages
kleinschmidt Aug 4, 2018
ed2855a
use (syms...)-> form for FunctionTerms anon functions; fix order
kleinschmidt Aug 4, 2018
6f22a5c
how to "call" function term
kleinschmidt Aug 4, 2018
ff01576
add getindex and inverse index field to ContrastsMatrix
kleinschmidt Aug 4, 2018
3d38114
fix up/add missing schema methods
kleinschmidt Aug 4, 2018
f0a0d6b
evaluate continuous and categorical terms
kleinschmidt Aug 4, 2018
ae86664
update notes
kleinschmidt Aug 4, 2018
16ec078
don't call terms but use model_cols; exports
kleinschmidt Aug 5, 2018
ed0f779
width methods, exports
kleinschmidt Aug 5, 2018
cfd1c05
rename nt_anon to capture_call_ex!, docstrings
kleinschmidt Aug 5, 2018
6e23e8c
convenience methods: schema from data.table, apply_schema to formula
kleinschmidt Aug 7, 2018
bde01ac
view, don't copy
kleinschmidt Aug 7, 2018
6d16fa5
termnames
kleinschmidt Aug 7, 2018
b1711cb
WIP handle one-sided formula
kleinschmidt Aug 7, 2018
ce74e46
WIP replace model frame
kleinschmidt Aug 7, 2018
5f06e4b
remove dead code from formula.jl
kleinschmidt Aug 7, 2018
a81c614
tweaking schema: use vector instead of set, default to categorical
kleinschmidt Aug 7, 2018
98c51df
add invindex to fulldummy, handle InterceptTerm{false}
kleinschmidt Aug 7, 2018
cd30202
add width method for interaction terms
kleinschmidt Aug 7, 2018
5ffb9a2
exports, dispatch/datastreams, bugs in modelmatrix.assign
kleinschmidt Aug 7, 2018
baa8a89
WIP adapting tests to new setup
kleinschmidt Aug 7, 2018
e23af0b
add termnames and model_cols for formula term; compute catdims
kleinschmidt Aug 8, 2018
57ce097
escape special calls
kleinschmidt Aug 8, 2018
7e32835
add ConstantTerm, convert to InterceptTerm at schema time
kleinschmidt Aug 10, 2018
5f31cf8
add a few methods to mimic the old Terms fields
kleinschmidt Aug 10, 2018
ee87cad
generalized drop_term and term() convenience function
kleinschmidt Aug 10, 2018
20e4e75
"safety" fall back for term(::AbstractTerm)
kleinschmidt Aug 10, 2018
9e0c061
fixes for apply_schema and ConstantTerm
kleinschmidt Aug 11, 2018
41b7293
delete dead Formula code
kleinschmidt Aug 11, 2018
1ca9e43
WIP fixing up tests: df->d, new formula struct, exports
kleinschmidt Aug 11, 2018
b88037e
notes?
kleinschmidt Aug 11, 2018
dd45f5d
compat cleanup
kleinschmidt Aug 16, 2018
cafd34d
compat cleanup
kleinschmidt Aug 16, 2018
a296f58
remove three-arg MF, add missing_omit before schema gen
kleinschmidt Aug 16, 2018
4f4478d
fix deepcopy in tests
kleinschmidt Aug 16, 2018
4d4c342
tuple syntax
kleinschmidt Aug 16, 2018
bcdacba
fallback on modelmatrix to convert vector to mat
kleinschmidt Aug 16, 2018
92d5bca
reshape to ensure model matrix is a matrix
kleinschmidt Aug 18, 2018
323a2de
add contrasts kwarg for ModelFrame
kleinschmidt Aug 18, 2018
f0e7439
WIP update tests for column ordering in interaction terms
kleinschmidt Aug 18, 2018
180e7a7
utility function to do rowwise "kronecker" product (inside-out)
kleinschmidt Aug 18, 2018
b922531
use [row_]kron_insideout for interaction terms model_cols/termnames
kleinschmidt Aug 18, 2018
74a4b20
add test for arbitrary functions
kleinschmidt Aug 27, 2018
6b452b3
use termvars instead of termsyms for modelframe constructor
kleinschmidt Aug 27, 2018
ad96fc7
vectorize coefnames and make modelmatrix tests pass
kleinschmidt Aug 27, 2018
f226780
move inside out krons to terms.jl
kleinschmidt Aug 29, 2018
99fdc96
fix indentation and remove/rename old modelframe/matrix files
kleinschmidt Aug 29, 2018
c8a05fb
remove notes file
kleinschmidt Aug 29, 2018
71e0bb1
don't try to include this deleted file
kleinschmidt Aug 29, 2018
11ebe7d
add some docstrings and fix one dangling where var
kleinschmidt Sep 6, 2018
e917aa6
get rid of is_special
kleinschmidt Oct 12, 2018
f447dad
add parsed args to FormulaTerm/capture_call and remove is_special
kleinschmidt Oct 12, 2018
08b4907
tidy up associative a bit
kleinschmidt Oct 12, 2018
8d42d5b
fix broadcasting on 1.0 with Ref wrappers
kleinschmidt Sep 6, 2018
cb146a2
WIP three-arg apply_schema, and fit interface
kleinschmidt Oct 9, 2018
23d3476
drop dataframes/datastreams for Tables
kleinschmidt Oct 14, 2018
5b2d098
eh who needs subtraction
kleinschmidt Oct 14, 2018
b9bae16
adding terms to tuples and hasnointercept
kleinschmidt Oct 14, 2018
affd434
use model type when applying schema as well
kleinschmidt Oct 14, 2018
3b90958
boo "DataFrames" 👎 hooray "Tables" 🎉
kleinschmidt Oct 14, 2018
d9b4bd1
formula terms need schema because pessimism
kleinschmidt Oct 16, 2018
2b7c5be
default to StatsModels.StatisticalModel for ModelFrame model type
kleinschmidt Oct 16, 2018
9cd254e
don't test deprecated constructor for old Formula type
kleinschmidt Oct 16, 2018
17f7154
escape anonymous function generated by capture_call
kleinschmidt Oct 16, 2018
43b8a77
tests for extending formula with special terms and models
kleinschmidt Oct 16, 2018
49f004b
put non-redundant term tests in own testset
kleinschmidt Oct 16, 2018
73b8ac9
export FunctionTerm
kleinschmidt Oct 16, 2018
2f8d76c
update travis script
kleinschmidt Oct 16, 2018
b2e6322
use DataStructures.SortedDict to sort levels of CategoricalTerms
kleinschmidt Oct 16, 2018
b482689
keep track of model type in ModelFrame
kleinschmidt Oct 16, 2018
2962768
schema fixes: apply to cont/cat, get/merge from FullRank wrapper
kleinschmidt Oct 16, 2018
0c5d7f8
implement setcontrasts!
kleinschmidt Oct 16, 2018
4a4ca4e
fix ordering of added intercept term for statistical model
kleinschmidt Oct 16, 2018
0f5638e
clean up/update contrasts tests
kleinschmidt Oct 16, 2018
9be3301
ensure that args to kron_insideout are vectors
kleinschmidt Oct 16, 2018
ee4a9cb
check for implicit intercept trait in elseif after drop_intercept
kleinschmidt Oct 17, 2018
ec8bb33
factor out missing_omit for term
kleinschmidt Oct 17, 2018
97a2526
update statsmodels and tests
kleinschmidt Oct 17, 2018
b4770b2
make a note of changed behavior for mismatching levels; update test
kleinschmidt Oct 17, 2018
6f5e43c
move ContrastsMatrix invindex to inner constructor, make immutable
kleinschmidt Oct 17, 2018
b6e8c03
WIP code review; add missing method for schema, docstring w/ example
kleinschmidt Oct 17, 2018
b4efc36
specialize fields for FullRank wrapper
kleinschmidt Oct 17, 2018
30602fd
more code review/cleanup: precompile, notes/comments, docs, tuples
kleinschmidt Oct 17, 2018
5cd6f05
review: one-line wheres, docstrings, model_cols
kleinschmidt Oct 17, 2018
5e72aed
docstring quoting
nalimilan Nov 1, 2018
39b3065
test that no longer fails
kleinschmidt Nov 1, 2018
b980d4e
WIP code review changes
kleinschmidt Nov 3, 2018
f6503e8
test implicit intercept explicitly
kleinschmidt Nov 12, 2018
2d3ae70
remove onlinestats dependency
kleinschmidt Nov 8, 2018
a629173
fix mixed up RHS/LHS in hasresponse
kleinschmidt Nov 13, 2018
b4d4c25
more code review: tests, comments, non-dot assignment
kleinschmidt Nov 14, 2018
6ec445b
change printing of categorical terms and add some terms tests
kleinschmidt Nov 14, 2018
6ec9c00
more code review: selects, missings, generators
kleinschmidt Nov 14, 2018
e478781
add a MatrixTerm container for terms to make a single matrix
kleinschmidt Nov 16, 2018
d7a3a9f
model_cols(::MatrixTerm, d) always gives a matrix
kleinschmidt Nov 16, 2018
6484beb
no longer necessary to convert reshape to matrix in ModelMatrix
kleinschmidt Nov 16, 2018
fcaf04e
export MatrixTerm and just show the terms
kleinschmidt Nov 16, 2018
2a0d6e3
remove missing_omit helper functions
kleinschmidt Nov 17, 2018
a1d0160
a stub for formula documentation
kleinschmidt Nov 17, 2018
a550608
use travis jobs and --project=docs/ for documenter.jl
kleinschmidt Nov 17, 2018
cf33d30
WIP updating docs
kleinschmidt Nov 19, 2018
46e2e2f
model_matrix/response convenience functions, exports, TupleTerm
kleinschmidt Dec 18, 2018
cdda548
docs: example of formula, brief explanation of components
kleinschmidt Dec 18, 2018
e128ef8
tupleterm
kleinschmidt Dec 18, 2018
a4abc6c
docs: tweaks, non-DSL calls, lifecycle/macro time
kleinschmidt Dec 18, 2018
58929e1
widen signature for schema of terms
kleinschmidt Dec 18, 2018
d5518d1
WIP documenting schema time
kleinschmidt Dec 19, 2018
03fe509
move lifecycle to internals
kleinschmidt Jan 6, 2019
1cf5a79
WIP documenting API (docstrings and an api.md page)
kleinschmidt Jan 12, 2019
47ea2cf
use repl instead of jldoctest for examples
kleinschmidt Jan 12, 2019
4707e8b
edits to internals and WIP on extending
kleinschmidt Jan 12, 2019
70fe605
docs for schema, coefnames (nee termnames), term
kleinschmidt Jan 12, 2019
c7c9fec
re-organize api.md
kleinschmidt Jan 14, 2019
3cdfe88
use repl blocks in formula.md and internals.md
kleinschmidt Jan 14, 2019
29fb4c0
cleanup: docs for wrappers, some small edits
kleinschmidt Jan 14, 2019
3625e69
typo
kleinschmidt Jan 14, 2019
fb73955
docs for model_cols
kleinschmidt Jan 14, 2019
81df52b
print & with spaces (like old), examples
kleinschmidt Jan 14, 2019
c19178c
editing internals.md
kleinschmidt Jan 14, 2019
7328df1
model_matrix for any term, using extract_matrix_terms by default
kleinschmidt Jan 15, 2019
e80a1c7
docs with simulated data and fit(LinearModel)
kleinschmidt Jan 15, 2019
5ecd904
document fit
kleinschmidt Jan 15, 2019
0bb18ed
tweak signature in docstring for apply_schema
kleinschmidt Jan 15, 2019
5305ed3
fix bug in parsing/capture_call and update docs for custom syntax
kleinschmidt Jan 15, 2019
d5aab3e
update docs index.md
kleinschmidt Jan 16, 2019
5f27f15
tweak internals intro docs
kleinschmidt Jan 16, 2019
7445611
use StatsBase.response/modelmatrix, drop Missings
kleinschmidt Jan 16, 2019
eeebb59
actually do need missings for Missings.T
kleinschmidt Jan 16, 2019
0982ad7
edits to docs from review
nalimilan Jan 18, 2019
945eb95
edits suggested in review by @nalimilan
kleinschmidt Jan 18, 2019
86a7970
fix example of apply_schema in api note
kleinschmidt Jan 18, 2019
1a4a9f6
update documenter version and fill out docs/Project.toml
kleinschmidt Feb 7, 2019
0d08c29
use @ repl block for GLM example
kleinschmidt Feb 7, 2019
7df9545
use IOContext(:limit=>false) for verbose, multiline printing
kleinschmidt Feb 7, 2019
e3d477d
also add stdlib packages to REQUIRE?
kleinschmidt Feb 7, 2019
3ef7e37
fix bug with kron_insideout for single-row model_cols
kleinschmidt Feb 7, 2019
cceb40c
fix missing example of single-term model_cols
kleinschmidt Feb 7, 2019
6a143bc
use 3-arg show(io, mime, x) for verbose printing of terms
kleinschmidt Feb 7, 2019
b1dfb50
missing using Tables in test
kleinschmidt Feb 7, 2019
25f8e3c
test show methods with mime
kleinschmidt Feb 7, 2019
1fadbc3
mime show for functions, poly term examples with GLM, whitespace
kleinschmidt Feb 7, 2019
4c42caa
Apply suggestions from @nalimilan code review
nalimilan Feb 7, 2019
b901a94
more code review
nalimilan Feb 9, 2019
e99fb00
more code review changes
kleinschmidt Feb 7, 2019
70269de
use doctests
kleinschmidt Feb 7, 2019
06adddb
Mod::Type
kleinschmidt Feb 7, 2019
ba4ee8d
using statsbase for doctests, make FullRank immutable
kleinschmidt Feb 7, 2019
59c8cf2
split schema(::Term) into concrete_term
kleinschmidt Feb 8, 2019
ad79f36
[extract->collect]_matrix_terms, more doctest filters, edits
kleinschmidt Feb 9, 2019
4681d82
docstring edits, functionterm names
kleinschmidt Feb 9, 2019
fe81c38
get rid of names field in function term (use type parameter)
kleinschmidt Feb 11, 2019
d6e6d68
don't rely on autobroadcasting in coefnames, use [] (for vcat)
kleinschmidt Feb 11, 2019
b32a56c
don't reduce(vcat, (...)), use [...]
kleinschmidt Feb 11, 2019
c61ec7b
One -> onet in test
kleinschmidt Feb 11, 2019
ca5dc21
cleanup tests
kleinschmidt Feb 11, 2019
9f7055d
fix docstring for formulaterm
kleinschmidt Feb 11, 2019
39b413a
get rid of stdlib in require
kleinschmidt Feb 11, 2019
53c24b7
model_cols -> modelcols
kleinschmidt Feb 13, 2019
1452e81
fix docfilter
kleinschmidt Feb 19, 2019
961edb5
remove auto-broadcasting term(args...) method
kleinschmidt Feb 20, 2019
af903f0
omitsintercept and tests, docs cleanup, editing non-DSL docs
kleinschmidt Feb 21, 2019
fd72a68
consolidate show methods, use kwarg instead of IOContext for prefix
kleinschmidt Feb 21, 2019
a2d615a
use something and prefix=nothing for showing tuple of terms
kleinschmidt Feb 21, 2019
a2856c5
final code review stuff
kleinschmidt Feb 21, 2019
01cb49f
Apply suggestions from code review (@nalimilan)
nalimilan Feb 21, 2019
5b86685
don't test on 0.7, do test on 1.1
kleinschmidt Feb 21, 2019
1240a6f
use julia 1.1 for documentation stage
kleinschmidt Feb 21, 2019
a55c82b
rename kwarg for ModelFrame from mod to model
kleinschmidt Feb 22, 2019
0245dd3
WIP converting to doctests
kleinschmidt Feb 22, 2019
003e5ea
doctests in internals docs
kleinschmidt Feb 22, 2019
617b7a9
doctests for formula.md too
kleinschmidt Feb 22, 2019
909eae2
don't use "DSL" in docs/src/formula.md
kleinschmidt Feb 24, 2019
4344524
code review: update docs/src/formula.md
mschauer Mar 4, 2019
4c870e5
don't use "DSL" in docs/src/formula.md
kleinschmidt Feb 24, 2019
a89b515
broken but for a different reason :)
kleinschmidt Feb 24, 2019
f8249da
clarify continuous terms intro
kleinschmidt Mar 4, 2019
5b2d211
some doctest fixes?
kleinschmidt Mar 4, 2019
3af99b7
Merge branch 'master' into dfk/terms2.0
kleinschmidt Mar 4, 2019
7841d37
require 1.0, test on 1.[0,1] on appveyor, fix old naked formula
kleinschmidt Mar 4, 2019
20 changes: 14 additions & 6 deletions .travis.yml
@@ -4,15 +4,23 @@ os:
- linux
- osx
julia:
- 0.7
- 1.0
- 1.1
- nightly
matrix:
allow_failures:
julia: nightly
notifications:
email: false

after_success:
# push coverage results to Codecov and Coveralls
- julia -e 'using Pkg; Pkg.add("Coverage"); using Coverage; Coveralls.submit(process_folder()); Codecov.submit(process_folder())'
# Update the documentation
- julia -e 'using Pkg; Pkg.add("Documenter")'
- julia -e 'using Pkg; include(joinpath("docs", "make.jl"))'
jobs:
include:
- stage: "Documentation"
julia: 1.1
os: linux
script:
- julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd()));
Pkg.instantiate()'
- julia --project=docs/ docs/make.jl
after_success: skip
7 changes: 5 additions & 2 deletions REQUIRE
@@ -1,3 +1,6 @@
julia 0.7-beta
DataFrames 0.15.0
julia 1.0
CategoricalArrays
Tables
StatsBase 0.22.0
DataStructures
Missings
2 changes: 1 addition & 1 deletion appveyor.yml
@@ -1,7 +1,7 @@
environment:
matrix:
- julia_version: 0.7
- julia_version: 1.0
- julia_version: 1.1
- julia_version: nightly

platform:
10 changes: 10 additions & 0 deletions docs/Project.toml
@@ -0,0 +1,10 @@
[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
GLM = "38e38edf-8417-5370-95a0-9cbb8c7f171a"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
StatsModels = "3eaba693-59b7-5ba5-a881-562e759f1c8d"
Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"

[compat]
Documenter = "~0.21"
11 changes: 4 additions & 7 deletions docs/make.jl
@@ -1,19 +1,16 @@
  using Documenter, StatsModels

  makedocs(
- format = :html,
  sitename = "StatsModels.jl",
  pages = [
  "Introduction" => "index.md",
  "Modeling tabular data" => "formula.md",
- "Contrast coding categorical variables" => "contrasts.md"
+ "Internals and extending the `@formula`" => "internals.md",
+ "Contrast coding categorical variables" => "contrasts.md",
+ "API documentation" => "api.md"
  ]
  )

  deploydocs(
- julia = "0.6",
- repo = "github.com/JuliaStats/StatsModels.jl.git",
- target = "build",
- deps = nothing,
- make = nothing
+ repo = "github.com/JuliaStats/StatsModels.jl.git"
  )
93 changes: 93 additions & 0 deletions docs/src/api.md
@@ -0,0 +1,93 @@
```@meta
CurrentModule = StatsModels
DocTestSetup = quote
using StatsModels, Random, StatsBase
Random.seed!(2001)
end
DocTestFilters = [r"([a-z]*) => \1", r"getfield\(.*##[0-9]+#[0-9]+"]
```

# StatsModels.jl API

## Formulae and terms

```@docs
@formula
term
coefnames
modelcols
```

### Higher-order terms

```@docs
FormulaTerm
InteractionTerm
FunctionTerm
```

### Placeholder terms

```@docs
Term
ConstantTerm
```

### Concrete terms

These are all generated by [`apply_schema`](@ref).

```@docs
ContinuousTerm
CategoricalTerm
InterceptTerm
MatrixTerm
collect_matrix_terms
is_matrix_term
```

## Schema

```@docs
schema
concrete_term
apply_schema
```

## Modeling

```@docs
fit
response
modelmatrix
```

### Traits

```@docs
StatsModels.implicit_intercept
StatsModels.drop_intercept
```

### Wrappers

!!! warning

These are internal implementation details that are likely to change in the
near future. In particular, the `ModelFrame` and `ModelMatrix` wrappers are
dispreferred in favor of using terms directly, and can in most cases be
replaced by something like

```julia
# instead of ModelMatrix(ModelFrame(f::FormulaTerm, data, model=MyModel))
sch = schema(f, data)
f = apply_schema(f, sch, MyModel)
response, predictors = modelcols(f, data)
```

```@docs
ModelFrame
ModelMatrix
StatsModels.TableStatisticalModel
StatsModels.TableRegressionModel
```