This note replicates the analysis of the wines sensory dataset presented by Hanafi & Quanari (2008)
Hanafi, M., Qannari, E.M., 2008. Nouvelles propriétés de l’analyse en composantes communes et poids spécifiques. Journal de la société française de statistique 149, 75–97
where the ComDim method (Common components and specific weights analysis, aka CCSWA) was used to explore the data.
Wines sensory dataset: A jury of four expert tasters ("judges") evaluated the appearance of eight wines according to the procedure known as free profile. Each judge noted on a scale from 0 to 10 the products according to his/her own variables. For a product and a given variable, the note given by a judge was the intensity on which he/she perceives the variable. Each matrix Xk (k = 1, ..., 4) was associated to one judge.
The goal of the ComDim study was to evaluate if there was an agreement between judges (or groups of judges), and to evaluate the relationships potentially existing among the eight wines.
using Jchemo, JchemoData using JLD2, CairoMakie
path_jdat = dirname(dirname(pathof(JchemoData))) db = joinpath(path_jdat, "data/wines_sensory.jld2") @load db dat @names dat
(:X1, :X2, :X3, :X4)
X1 = dat.X1 X2 = dat.X2 X3 = dat.X3 X4 = dat.X4 Xbl = [X1, X2, X3, X4] # Same as: Xbl = [[X1]; [X2]; [X3]; [X4]]
4-element Vector{DataFrames.DataFrame}:
8×4 DataFrame
Row │ red gilded soft plum
│ Float64 Float64 Float64 Float64
─────┼────────────────────────────────────
1 │ 7.0 0.0 5.0 8.0
2 │ 5.0 6.0 6.0 3.0
3 │ 7.0 2.0 5.0 5.0
4 │ 5.0 7.0 7.0 4.0
5 │ 5.0 7.0 6.0 4.0
6 │ 6.0 8.0 6.0 1.0
7 │ 5.0 4.0 10.0 3.0
8 │ 6.0 6.0 6.0 5.0
8×3 DataFrame
Row │ ruby coloured intensity
│ Float64 Float64 Float64
─────┼──────────────────────────────
1 │ 4.0 0.0 5.0
2 │ 3.0 6.0 5.0
3 │ 3.0 3.0 7.0
4 │ 1.0 6.0 3.0
5 │ 2.0 5.0 5.0
6 │ 1.0 5.0 4.0
7 │ 0.0 4.0 2.0
8 │ 2.0 6.0 4.0
8×4 DataFrame
Row │ red blue gilded intensity
│ Float64 Float64 Float64 Float64
─────┼──────────────────────────────────────
1 │ 7.0 4.0 2.0 6.0
2 │ 2.0 0.0 6.0 6.0
3 │ 6.0 3.0 4.0 7.0
4 │ 2.0 0.0 6.0 4.0
5 │ 5.0 1.0 5.0 6.0
6 │ 3.0 0.0 5.0 5.0
7 │ 2.0 0.0 4.0 3.0
8 │ 4.0 0.0 4.0 5.0
8×3 DataFrame
Row │ deep expenses brilliant
│ Float64 Float64 Float64
─────┼──────────────────────────────
1 │ 9.0 7.0 9.0
2 │ 8.0 6.0 7.0
3 │ 10.0 6.0 7.0
4 │ 7.0 7.0 8.0
5 │ 8.0 7.0 8.0
6 │ 8.0 8.0 10.0
7 │ 6.0 5.0 10.0
8 │ 8.0 9.0 10.0
n = nro(X1) nbl = length(Xbl)
4
In the ComDim analysis below, each block is scaled by its Frobenius norm (to equalize the group importances in regards to their total inertia), and three global components (scores) are computed
nlv = 3 bscal = :frob model = comdim(; nlv, bscal) fit!(model, Xbl) fitm = model.fitm @names fitm
(:T, :U, :W, :Tb, :Tbl, :Vbl, :lb, :mu, :fitm_bl, :weights, :niter, :par)
Global scores
@head T = fitm.T
3×3 Matrix{Float64}:
-2.50761 0.335161 -0.297369
0.644503 -0.692286 0.247392
-1.564 -0.540373 0.219861
... (8, 3)
i = 1 # define the pair of successive components f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, xlabel = "PC$i", ylabel = "PC$(i + 1)") text!(ax, T[:, i], T[:, i + 1]; text = string.("wi", 1:n), fontsize = 15) f
Block scores
Tbl = fitm.Tbl
4-element Vector{Matrix}:
[-1.7962627913744986 -0.008918615631455138 -0.22665348117393225; 0.3879039556775733 -0.1065661506250574 0.2127782058108359; … ; 0.30099461206273337 -0.585749646663945 -1.0867198878257494; 0.035545038805747414 0.3417179548041017 0.07458461012188208]
[-1.7044871814196665 0.5357943376354091 -0.4170589964322815; 0.1995449497670047 -0.6605733548084997 0.400922304133755; … ; 0.6150299861716623 0.9086489000953766 -1.0021145160270302; 0.5308334812361191 -0.13762360842297028 0.5902444200693617]
[-1.6933885952738224 0.2296343453843308 -0.2616132417652677; 0.7329488213391452 -0.5749914914629086 0.17354614604739924; … ; 0.7929766870562747 0.48867740320692926 -0.7455120724195884; 0.12923331248947625 0.32227484677350104 0.1879793984911528]
[-0.40801692432226466 0.46884767588169035 -0.36141026079567345; -0.2773904635616495 -0.9746086560813164 0.26113408530961424; … ; 1.0769677647773832 -0.11436320965711218 -1.2895662379308004; 0.2856187372760472 1.1470867080820784 0.436153894141458]
Block loadings
Vbl = fitm.Vbl
4-element Vector{Matrix}:
[-0.23221350649942402 0.5891168732065994 0.07642515391317986; 0.7607464762538525 0.6064931208295313 0.674847582189589; 0.3011913315908775 -0.13118586846138441 -0.7251746631780844; -0.5259519636251038 0.5175883229032195 -0.11340919075085852]
[-0.4537703469097276 -0.3697957744699179 0.38548166419002783; 0.7769886790381423 -0.31753284623159983 0.625096894897853; -0.4363267868375587 -0.8731689279562419 0.6787177311389095]
[-0.6612798979736141 0.4222002594039563 0.3219773142067391; -0.5710752899312963 -0.15488207530536102 -0.08580550040851531; 0.37567655711660647 -0.7638562292199929 0.48684592681033495; -0.30894827107285366 -0.46290619437435704 0.8074336311945677]
[-0.9493374803550765 -0.08204958567684245 0.5908887482427636; 0.0743842707357129 0.6405670458917806 0.6692544240608542; 0.3053282310239113 0.7635062050879062 -0.45049861606343977]
For block k
k = 1 Tbl[k]
8×3 Matrix{Float64}:
-1.79626 -0.00891862 -0.226653
0.387904 -0.106566 0.212778
-0.945808 -0.110159 0.124653
0.534977 0.161467 -0.00130687
0.452331 0.206683 0.432083
1.03032 0.101524 0.470582
0.300995 -0.58575 -1.08672
0.035545 0.341718 0.0745846
Vbl[k]
4×3 Matrix{Float64}:
-0.232214 0.589117 0.0764252
0.760746 0.606493 0.674848
0.301191 -0.131186 -0.725175
-0.525952 0.517588 -0.113409
res = summary(model, Xbl) @names res
(:explvarx, :explvarxx, :explxbl, :psal2, :contrxbl2t, :rvxbl2t, :rdxbl2t, :cortbl2t, :corx2t)
Proportion of the total XX' inertia explained by the global scores
res.explvarxx
| Row | lv | var | pvar | cumpvar |
|---|---|---|---|---|
| Int64 | Float64 | Float64 | Float64 | |
| 1 | 1 | 1.54101 | 0.681971 | 0.681971 |
| 2 | 2 | 0.229883 | 0.101734 | 0.783705 |
| 3 | 3 | 0.173808 | 0.0769181 | 0.860623 |
Proportion of the inertia of each block (= Xbl[k]) explained by the global scores
explxbl = res.explxbl # = specific weights 'lb' when 'bscal = :frob'
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| Float64 | Float64 | Float64 | |
| 1 | 0.670406 | 0.0117629 | 0.191347 |
| 2 | 0.663954 | 0.0408196 | 0.216385 |
| 3 | 0.780095 | 0.0605821 | 0.109844 |
| 4 | 0.20539 | 0.473718 | 0.279832 |
rowsum(explxbl)
4-element Vector{Float64}:
0.8735162349324337
0.9211580695715044
0.9505211974172872
0.9589401500789885
Contribution of each block to the global scores
contrxbl2t = res.contrxbl2t
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| Float64 | Float64 | Float64 | |
| 1 | 0.288987 | 0.020043 | 0.239962 |
| 2 | 0.286206 | 0.0695533 | 0.27136 |
| 3 | 0.33627 | 0.103227 | 0.137751 |
| 4 | 0.0885361 | 0.807177 | 0.350927 |
colsum(contrxbl2t)
3-element Vector{Float64}:
0.9999999999999999
1.0000000000000002
1.0000000000000002
z = contrxbl2t i = 1 f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = "PC$i", ylabel = "PC$(i + 1)") text!(ax, z[:, i], z[:, i + 1]; text = string.("ju", 1:nbl), fontsize = 15) xlims!(ax, [0, .4]) ylims!(ax, [0, .9]) f
RV coefficients between each block and the global LVs
res.rvxbl2t
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| Float64 | Float64 | Float64 | |
| 1 | 0.867627 | 0.0152298 | 0.247626 |
| 2 | 0.895813 | 0.0550478 | 0.291952 |
| 3 | 0.943026 | 0.0732225 | 0.132785 |
| 4 | 0.313562 | 0.723312 | 0.42727 |
Correlation between the X-variables and the global scores
corx2t = res.corx2t
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| Float64 | Float64 | Float64 | |
| 1 | -0.835675 | 0.280847 | 0.14693 |
| 2 | 0.873758 | 0.0923135 | 0.414077 |
| 3 | 0.601193 | -0.0347772 | -0.773383 |
| 4 | -0.826089 | 0.107675 | -0.095143 |
| 5 | -0.8111 | -0.163856 | 0.393378 |
| 6 | 0.880278 | -0.0891491 | 0.404286 |
| 7 | -0.678057 | -0.336407 | 0.602167 |
| 8 | -0.931598 | 0.165772 | 0.170208 |
| 9 | -0.983061 | -0.0743055 | -0.0554282 |
| 10 | 0.792075 | -0.448744 | 0.385157 |
| 11 | -0.665309 | -0.277749 | 0.652515 |
| 12 | -0.778559 | -0.102133 | 0.56566 |
| 13 | 0.0585256 | 0.765167 | 0.614377 |
| 14 | 0.229752 | 0.872646 | -0.395764 |
z = corx2t nvar = nro(z) i = 1 f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = "PC$i", ylabel = "PC$(i + 1)") text!(ax, z[:, i], z[:, i + 1]; text = string.("va", 1:nvar), fontsize = 15) xlims!(ax, [-1, 1]) ylims!(ax, [-1, 1]) f
Block k in the global score space
k = 4 v = Vbl[k] nam = names(Xbl[k]) i = 1 f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, title = "Block$k", xlabel = "PC$i", ylabel = "PC$(i + 1)") text!(ax, T[:, i], T[:, i + 1]; text = string.("wi", 1:n), fontsize = 15) scatter!(ax, v[:, i], v[:, i + 1]) text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15) ylims!(ax, [-1, 1]) f
Block k in the block score space
k = 4 # example of the fourth block z = Tbl[k] v = Vbl[k] nam = names(Xbl[k]) i = 1 f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, title = "Block$k", xlabel = "PC$i", ylabel = "PC$(i + 1)") text!(ax, z[:, i], z[:, i + 1]; text = string.("wi", 1:n), fontsize = 15) scatter!(ax, v[:, i], v[:, i + 1]) text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15) ylims!(ax, [-1, 1]) f
Judge 4 was atypical from the other judges and determined the global score 2. In contrast to judges 1-2-3, judge 4 gave high ratings to wines wi6 and wi8 (variables v13: expenses v14: brilliant).