This note replicates the analysis of the wines sensory dataset presented by Hanafi & Qannari (2008):
Hanafi, M., Qannari, E.M., 2008. Nouvelles propriétés de l’analyse en composantes communes et poids spécifiques. Journal de la société française de statistique 149, 75–97
where the ComDim method (Common components and specific weights analysis; aka CCSWA) is used to explore the data.
Wines sensory dataset: A jury of four expert tasters ("judges") evaluated the appearance of eight wines following the free-profiling procedure: each judge scores the products on a scale from 0 to 10 using his/her own set of variables. For a given product and variable, the score is the intensity with which the judge perceives that attribute. Each matrix Xk (k = 1, ..., 4) is associated with one judge.
The goal of the ComDim study was to evaluate whether there was agreement between judges (or groups of judges) and to assess the relationships among the eight wines.
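As a brief reminder of the model behind the method (this follows the usual presentation of CCSWA; the notation is ours, not Jchemo's): ComDim looks for common components shared by all blocks, together with non-negative block-specific weights ("saliences"), such that each block cross-product matrix is approximated by

$$
X_k X_k^{\mathsf{T}} \;\approx\; \sum_{a=1}^{A} \lambda_{k,a} \, \mathbf{q}_a \mathbf{q}_a^{\mathsf{T}}, \qquad \lambda_{k,a} \ge 0,
$$

where the $\mathbf{q}_a$ are the common (global) components and the saliences $\lambda_{k,a}$ quantify how strongly component $a$ is expressed in block $k$ (these are the "specific weights" lb returned by the fitted model below). Normalization conventions for the components vary between implementations.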
```julia
using Jchemo, JchemoData
using JLD2, CairoMakie
```
```julia
path_jdat = dirname(dirname(pathof(JchemoData)))
db = joinpath(path_jdat, "data/wines_sensory.jld2")
@load db dat
@names dat
```
```
(:X1, :X2, :X3, :X4)
```
```julia
X1 = dat.X1
X2 = dat.X2
X3 = dat.X3
X4 = dat.X4
Xbl = [X1, X2, X3, X4]    # Same as: Xbl = [[X1]; [X2]; [X3]; [X4]]
```
```
4-element Vector{DataFrame}:
 8×4 DataFrame
 Row │ red      gilded   soft     plum
     │ Float64  Float64  Float64  Float64
─────┼────────────────────────────────────
   1 │     7.0      0.0      5.0      8.0
   2 │     5.0      6.0      6.0      3.0
   3 │     7.0      2.0      5.0      5.0
   4 │     5.0      7.0      7.0      4.0
   5 │     5.0      7.0      6.0      4.0
   6 │     6.0      8.0      6.0      1.0
   7 │     5.0      4.0     10.0      3.0
   8 │     6.0      6.0      6.0      5.0
 8×3 DataFrame
 Row │ ruby     coloured  intensity
     │ Float64  Float64   Float64
─────┼──────────────────────────────
   1 │     4.0       0.0        5.0
   2 │     3.0       6.0        5.0
   3 │     3.0       3.0        7.0
   4 │     1.0       6.0        3.0
   5 │     2.0       5.0        5.0
   6 │     1.0       5.0        4.0
   7 │     0.0       4.0        2.0
   8 │     2.0       6.0        4.0
 8×4 DataFrame
 Row │ red      blue     gilded   intensity
     │ Float64  Float64  Float64  Float64
─────┼──────────────────────────────────────
   1 │     7.0      4.0      2.0        6.0
   2 │     2.0      0.0      6.0        6.0
   3 │     6.0      3.0      4.0        7.0
   4 │     2.0      0.0      6.0        4.0
   5 │     5.0      1.0      5.0        6.0
   6 │     3.0      0.0      5.0        5.0
   7 │     2.0      0.0      4.0        3.0
   8 │     4.0      0.0      4.0        5.0
 8×3 DataFrame
 Row │ deep     expenses  brilliant
     │ Float64  Float64   Float64
─────┼──────────────────────────────
   1 │     9.0       7.0        9.0
   2 │     8.0       6.0        7.0
   3 │    10.0       6.0        7.0
   4 │     7.0       7.0        8.0
   5 │     8.0       7.0        8.0
   6 │     8.0       8.0       10.0
   7 │     6.0       5.0       10.0
   8 │     8.0       9.0       10.0
```
```julia
n = nro(X1)
nbl = length(Xbl)
```
```
4
```
Below, each block is scaled by its Frobenius norm, and three global components (global scores) are computed:
```julia
nlv = 3
bscal = :frob
model = comdim(; nlv, bscal)
fit!(model, Xbl)
fitm = model.fitm
@names fitm
```
```
(:T, :U, :W, :Tb, :Tbl, :Vbl, :lb, :mu, :fitmbl, :weights, :niter, :par)
```
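For intuition about the bscal = :frob option used above: each block is divided by its Frobenius norm, so that every block starts with the same total inertia before the common components are extracted. A minimal plain-Julia sketch of that idea (not the internal Jchemo preprocessing, which may also center the blocks):

```julia
using LinearAlgebra

# Divide each block by its Frobenius norm: all scaled blocks then have norm 1
Xbl_scaled = [Matrix(X) / norm(Matrix(X)) for X in Xbl]
norm.(Xbl_scaled)
```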
Global scores
```julia
@head T = fitm.T
```
```
3×3 Matrix{Float64}:
 -2.50761    0.335161  -0.297369
  0.644503  -0.692286   0.247392
 -1.564     -0.540373   0.219861
... (8, 3)
```
```julia
i = 1
f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1))
text!(ax, T[:, i], T[:, i + 1]; text = string.("w", 1:n), fontsize = 15)
f
```
Block scores and block loadings
```julia
Tbl = fitm.Tbl
```
```
4-element Vector{Matrix}:
 [-1.7962627913744986 -0.008918615631455138 -0.22665348117393225; 0.3879039556775733 -0.1065661506250574 0.2127782058108359; … ; 0.30099461206273337 -0.585749646663945 -1.0867198878257494; 0.035545038805747414 0.3417179548041017 0.07458461012188208]
 [-1.7044871814196665 0.5357943376354091 -0.4170589964322815; 0.1995449497670047 -0.6605733548084997 0.400922304133755; … ; 0.6150299861716623 0.9086489000953766 -1.0021145160270302; 0.5308334812361191 -0.13762360842297028 0.5902444200693617]
 [-1.6933885952738224 0.2296343453843308 -0.2616132417652677; 0.7329488213391452 -0.5749914914629086 0.17354614604739924; … ; 0.7929766870562747 0.48867740320692926 -0.7455120724195884; 0.12923331248947625 0.32227484677350104 0.1879793984911528]
 [-0.40801692432226466 0.46884767588169035 -0.36141026079567345; -0.2773904635616495 -0.9746086560813164 0.26113408530961424; … ; 1.0769677647773832 -0.11436320965711218 -1.2895662379308004; 0.2856187372760472 1.1470867080820784 0.436153894141458]
```
```julia
Vbl = fitm.Vbl
```
```
4-element Vector{Matrix}:
 [-0.23221350649942402 0.5891168732065994 0.07642515391317986; 0.7607464762538525 0.6064931208295313 0.674847582189589; 0.3011913315908775 -0.13118586846138441 -0.7251746631780844; -0.5259519636251038 0.5175883229032195 -0.11340919075085852]
 [-0.4537703469097276 -0.3697957744699179 0.38548166419002783; 0.7769886790381423 -0.31753284623159983 0.625096894897853; -0.4363267868375587 -0.8731689279562419 0.6787177311389095]
 [-0.6612798979736141 0.4222002594039563 0.3219773142067391; -0.5710752899312963 -0.15488207530536102 -0.08580550040851531; 0.37567655711660647 -0.7638562292199929 0.48684592681033495; -0.30894827107285366 -0.46290619437435704 0.8074336311945677]
 [-0.9493374803550765 -0.08204958567684245 0.5908887482427636; 0.0743842707357129 0.6405670458917806 0.6692544240608542; 0.3053282310239113 0.7635062050879062 -0.45049861606343977]
```
For block k
```julia
k = 1
Tbl[k]
```
```
8×3 Matrix{Float64}:
 -1.79626   -0.00891862  -0.226653
  0.387904  -0.106566     0.212778
 -0.945808  -0.110159     0.124653
  0.534977   0.161467    -0.00130687
  0.452331   0.206683     0.432083
  1.03032    0.101524     0.470582
  0.300995  -0.58575     -1.08672
  0.035545   0.341718     0.0745846
```
```julia
Vbl[k]
```
```
4×3 Matrix{Float64}:
 -0.232214   0.589117   0.0764252
  0.760746   0.606493   0.674848
  0.301191  -0.131186  -0.725175
 -0.525952   0.517588  -0.113409
```
```julia
res = summary(model, Xbl)
@names res
```
```
(:explvarx, :explvarxx, :explxbl, :psal2, :contrxbl2t, :rvxbl2t, :rdxbl2t, :cortbl2t, :corx2t)
```
Proportion of the total XX' inertia explained by the global scores
```julia
res.explvarxx
```
| Row | lv | var | pvar | cumpvar |
|---|---|---|---|---|
| 1 | 1 | 1.54101 | 0.681971 | 0.681971 |
| 2 | 2 | 0.229883 | 0.101734 | 0.783705 |
| 3 | 3 | 0.173808 | 0.0769181 | 0.860623 |
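As a reading aid, cumpvar is simply the running sum of pvar, e.g.:

```julia
cumsum([0.681971, 0.101734, 0.0769181])    # ≈ [0.681971, 0.783705, 0.860623]
```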
Proportion of the inertia of each block (= Xbl[k]) explained by the global scores
```julia
explxbl = res.explxbl    # = specific weights 'lb' when 'bscal = :frob'
```
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| 1 | 0.670406 | 0.0117629 | 0.191347 |
| 2 | 0.663954 | 0.0408196 | 0.216385 |
| 3 | 0.780095 | 0.0605821 | 0.109844 |
| 4 | 0.20539 | 0.473718 | 0.279832 |
```julia
rowsum(explxbl)
```
```
4-element Vector{Float64}:
 0.8735162349324337
 0.9211580695715044
 0.9505211974172872
 0.9589401500789885
```
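As indicated in the comment above, with bscal = :frob these explained proportions correspond to the specific weights lb stored in the fitted model (the :lb field listed earlier). They can be displayed directly (the layout of lb, blocks in rows vs. columns, may differ from explxbl):

```julia
# Specific weights ('saliences') of each block for each global component
fitm.lb
```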
Contribution of each block to the global scores
```julia
contrxbl2t = res.contrxbl2t
```
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| 1 | 0.288987 | 0.020043 | 0.239962 |
| 2 | 0.286206 | 0.0695533 | 0.27136 |
| 3 | 0.33627 | 0.103227 | 0.137751 |
| 4 | 0.0885361 | 0.807177 | 0.350927 |
```julia
colsum(contrxbl2t)
```
```
3-element Vector{Float64}:
 0.9999999999999999
 1.0000000000000002
 1.0000000000000002
```
```julia
z = contrxbl2t
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1))
text!(ax, z[:, i], z[:, i + 1]; text = string.("j", 1:nbl), fontsize = 15)
xlims!(ax, [0, .4])
ylims!(ax, [0, .9])
f
```
RV coefficients between each block and the global LVs
```julia
res.rvxbl2t
```
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| 1 | 0.867627 | 0.0152298 | 0.247626 |
| 2 | 0.895813 | 0.0550478 | 0.291952 |
| 3 | 0.943026 | 0.0732225 | 0.132785 |
| 4 | 0.313562 | 0.723312 | 0.42727 |
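For reference, the RV coefficient between two matrices can also be computed by hand. Below is a minimal sketch assuming column-centered data (the values may differ slightly from res.rvxbl2t depending on the preprocessing applied internally); the function name rvcoef is ours:

```julia
using LinearAlgebra, Statistics

# RV(X, Y) = tr(XX'YY') / sqrt(tr((XX')^2) * tr((YY')^2))
function rvcoef(X, Y)
    Xc = X .- mean(X, dims = 1)
    Yc = Y .- mean(Y, dims = 1)
    A = Xc * Xc'
    B = Yc * Yc'
    tr(A * B) / sqrt(tr(A * A) * tr(B * B))
end

# E.g., RV between block 1 and the first global score (kept as a one-column matrix)
rvcoef(Matrix(Xbl[1]), T[:, 1:1])
```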
Correlation between the X-variables and the global scores
```julia
corx2t = res.corx2t
```
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| 1 | -0.835675 | 0.280847 | 0.14693 |
| 2 | 0.873758 | 0.0923135 | 0.414077 |
| 3 | 0.601193 | -0.0347772 | -0.773383 |
| 4 | -0.826089 | 0.107675 | -0.095143 |
| 5 | -0.8111 | -0.163856 | 0.393378 |
| 6 | 0.880278 | -0.0891491 | 0.404286 |
| 7 | -0.678057 | -0.336407 | 0.602167 |
| 8 | -0.931598 | 0.165772 | 0.170208 |
| 9 | -0.983061 | -0.0743055 | -0.0554282 |
| 10 | 0.792075 | -0.448744 | 0.385157 |
| 11 | -0.665309 | -0.277749 | 0.652515 |
| 12 | -0.778559 | -0.102133 | 0.56566 |
| 13 | 0.0585256 | 0.765167 | 0.614377 |
| 14 | 0.229752 | 0.872646 | -0.395764 |
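These correlations can be cross-checked directly by concatenating the four blocks column-wise and correlating each variable with the global scores (a sketch; the result should be close to res.corx2t, up to the preprocessing applied internally):

```julia
using Statistics

Xcat = hcat(Matrix.(Xbl)...)    # 8 × 14 matrix of all sensory variables
cor(Xcat, T)                    # 14 × 3 correlations with the global scores
```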
```julia
z = corx2t
nvar = nro(z)
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1))
text!(ax, z[:, i], z[:, i + 1]; text = string.("v", 1:nvar), fontsize = 15)
xlims!(ax, [-1, 1])
ylims!(ax, [-1, 1])
f
```
Block k in the global score space
```julia
k = 4
v = Vbl[k]
nam = names(Xbl[k])
i = 1
f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1))
text!(ax, T[:, i], T[:, i + 1]; text = string.("w", 1:n), fontsize = 15)
scatter!(ax, v[:, i], v[:, i + 1])
text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15)
ylims!(ax, [-1, 1])
f
```
Block k in the block score space
```julia
k = 4    # fourth block
z = Tbl[k]
v = Vbl[k]
nam = names(Xbl[k])
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1), title = string("Block ", k))
text!(ax, z[:, i], z[:, i + 1]; text = string.("w", 1:n), fontsize = 15)
scatter!(ax, v[:, i], v[:, i + 1])
text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15)
ylims!(ax, [-1, 1])
f
```
Judge 4 was atypical compared with the other judges and mainly determined global score 2: in contrast to judges 1-3, judge 4 gave particularly high ratings to wines w6 and w8 (variable v13: expenses, and variable v14: brilliant).