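using RDatasets, DataFrames

# Load the Hitters data set and drop rows with a missing Salary
data = dataset("ISLR", "Hitters")
dropmissing!(data, :Salary)
# Keep only the numeric columns
numerical_cols = [col for col in names(data) if eltype(data[!, col]) <: Number]
data = data[:, numerical_cols]
describe(data)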
| Row | variable | mean | min | median | max | nmissing | eltype |
|---|---|---|---|---|---|---|---|
| | Symbol | Float64 | Real | Float64 | Real | Int64 | DataType |
| 1 | AtBat | 403.643 | 19 | 413.0 | 687 | 0 | Int32 |
| 2 | Hits | 107.829 | 1 | 103.0 | 238 | 0 | Int32 |
| 3 | HmRun | 11.6198 | 0 | 9.0 | 40 | 0 | Int32 |
| 4 | Runs | 54.7452 | 0 | 52.0 | 130 | 0 | Int32 |
| 5 | RBI | 51.4867 | 0 | 47.0 | 121 | 0 | Int32 |
| 6 | Walks | 41.1141 | 0 | 37.0 | 105 | 0 | Int32 |
| 7 | Years | 7.31179 | 1 | 6.0 | 24 | 0 | Int32 |
| 8 | CAtBat | 2657.54 | 19 | 1931.0 | 14053 | 0 | Int32 |
| 9 | CHits | 722.186 | 4 | 516.0 | 4256 | 0 | Int32 |
| 10 | CHmRun | 69.2395 | 0 | 40.0 | 548 | 0 | Int32 |
| 11 | CRuns | 361.221 | 2 | 250.0 | 2165 | 0 | Int32 |
| 12 | CRBI | 330.418 | 3 | 230.0 | 1659 | 0 | Int32 |
| 13 | CWalks | 260.266 | 1 | 174.0 | 1566 | 0 | Int32 |
| 14 | PutOuts | 290.711 | 0 | 224.0 | 1377 | 0 | Int32 |
| 15 | Assists | 118.76 | 0 | 45.0 | 492 | 0 | Int32 |
| 16 | Errors | 8.59316 | 0 | 7.0 | 32 | 0 | Int32 |
| 17 | Salary | 535.926 | 67.5 | 425.0 | 2460.0 | 0 | Float64 |