This note replicates the analysis of the wines sensory dataset presented by Hanafi & Qannari (2008):
Hanafi, M., Qannari, E.M., 2008. Nouvelles propriétés de l’analyse en composantes communes et poids spécifiques. Journal de la société française de statistique 149, 75–97
where the ComDim method (Common components and specific weights analysis; aka CCSWA) is used to explore the data.
Wines sensory dataset: A jury of four expert tasters ("judges") evaluated the appearance of eight wines following the free-profiling procedure: each judge scores the products on a scale from 0 to 10 using his/her own set of variables. For a given product and variable, the score is the intensity with which the judge perceives that attribute. Each matrix Xk (k = 1, ..., 4) is associated with one judge.
The goal of the ComDim study was to evaluate whether there was agreement between judges (or groups of judges) and to assess the relationships among the eight wines.
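As a brief reminder of the model behind the method (this follows the usual presentation of CCSWA; the notation is ours, not Jchemo's): ComDim looks for common components shared by all blocks, together with non-negative block-specific weights ("saliences"), such that each block cross-product matrix is approximated by

$$
X_k X_k^{\mathsf{T}} \;\approx\; \sum_{a=1}^{A} \lambda_{k,a} \, \mathbf{q}_a \mathbf{q}_a^{\mathsf{T}}, \qquad \lambda_{k,a} \ge 0,
$$

where the $\mathbf{q}_a$ are the common (global) components and the saliences $\lambda_{k,a}$ quantify how strongly component $a$ is expressed in block $k$ (these are the "specific weights" lb returned by the fitted model below). Normalization conventions for the components vary between implementations.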
```julia
using Jchemo, JchemoData
using JLD2, CairoMakie
```
```julia
path_jdat = dirname(dirname(pathof(JchemoData)))
db = joinpath(path_jdat, "data/wines_sensory.jld2")
@load db dat
@names dat
```
```
(:X1, :X2, :X3, :X4)
```
```julia
X1 = dat.X1
X2 = dat.X2
X3 = dat.X3
X4 = dat.X4
Xbl = [X1, X2, X3, X4]    # Same as: Xbl = [[X1]; [X2]; [X3]; [X4]]
```
```
4-element Vector{DataFrame}:
 8×4 DataFrame
 Row │ red      gilded   soft     plum
     │ Float64  Float64  Float64  Float64
─────┼────────────────────────────────────
   1 │     7.0      0.0      5.0      8.0
   2 │     5.0      6.0      6.0      3.0
   3 │     7.0      2.0      5.0      5.0
   4 │     5.0      7.0      7.0      4.0
   5 │     5.0      7.0      6.0      4.0
   6 │     6.0      8.0      6.0      1.0
   7 │     5.0      4.0     10.0      3.0
   8 │     6.0      6.0      6.0      5.0
 8×3 DataFrame
 Row │ ruby     coloured  intensity
     │ Float64  Float64   Float64
─────┼──────────────────────────────
   1 │     4.0       0.0        5.0
   2 │     3.0       6.0        5.0
   3 │     3.0       3.0        7.0
   4 │     1.0       6.0        3.0
   5 │     2.0       5.0        5.0
   6 │     1.0       5.0        4.0
   7 │     0.0       4.0        2.0
   8 │     2.0       6.0        4.0
 8×4 DataFrame
 Row │ red      blue     gilded   intensity
     │ Float64  Float64  Float64  Float64
─────┼──────────────────────────────────────
   1 │     7.0      4.0      2.0        6.0
   2 │     2.0      0.0      6.0        6.0
   3 │     6.0      3.0      4.0        7.0
   4 │     2.0      0.0      6.0        4.0
   5 │     5.0      1.0      5.0        6.0
   6 │     3.0      0.0      5.0        5.0
   7 │     2.0      0.0      4.0        3.0
   8 │     4.0      0.0      4.0        5.0
 8×3 DataFrame
 Row │ deep     expenses  brilliant
     │ Float64  Float64   Float64
─────┼──────────────────────────────
   1 │     9.0       7.0        9.0
   2 │     8.0       6.0        7.0
   3 │    10.0       6.0        7.0
   4 │     7.0       7.0        8.0
   5 │     8.0       7.0        8.0
   6 │     8.0       8.0       10.0
   7 │     6.0       5.0       10.0
   8 │     8.0       9.0       10.0
```
```julia
n = nro(X1)
nbl = length(Xbl)
```
```
4
```
Below, each block is scaled by its Frobenius norm, and three global components (global scores) are computed:
```julia
nlv = 3
bscal = :frob
model = comdim(; nlv, bscal)
fit!(model, Xbl)
fitm = model.fitm
@names fitm
```
```
(:T, :U, :W, :Tb, :Tbl, :Vbl, :lb, :mu, :fitmbl, :weights, :niter, :par)
```
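For intuition about the bscal = :frob option used above: each block is divided by its Frobenius norm, so that every block starts with the same total inertia before the common components are extracted. A minimal plain-Julia sketch of that idea (not the internal Jchemo preprocessing, which may also center the blocks):

```julia
using LinearAlgebra

# Divide each block by its Frobenius norm: all scaled blocks then have norm 1
Xbl_scaled = [Matrix(X) / norm(Matrix(X)) for X in Xbl]
norm.(Xbl_scaled)
```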
Global scores
```julia
@head T = fitm.T
```
```
3×3 Matrix{Float64}:
 -2.50761    0.335161  -0.297369
  0.644503  -0.692286   0.247392
 -1.564     -0.540373   0.219861
... (8, 3)
```
```julia
i = 1
f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1))
text!(ax, T[:, i], T[:, i + 1]; text = string.("w", 1:n), fontsize = 15)
f
```
Block scores and block loadings
```julia
Tbl = fitm.Tbl
```
```
4-element Vector{Matrix}:
 [-1.7962627913744986 -0.008918615631455138 -0.22665348117393225; 0.3879039556775733 -0.1065661506250574 0.2127782058108359; … ; 0.30099461206273337 -0.585749646663945 -1.0867198878257494; 0.035545038805747414 0.3417179548041017 0.07458461012188208]
 [-1.7044871814196665 0.5357943376354091 -0.4170589964322815; 0.1995449497670047 -0.6605733548084997 0.400922304133755; … ; 0.6150299861716623 0.9086489000953766 -1.0021145160270302; 0.5308334812361191 -0.13762360842297028 0.5902444200693617]
 [-1.6933885952738224 0.2296343453843308 -0.2616132417652677; 0.7329488213391452 -0.5749914914629086 0.17354614604739924; … ; 0.7929766870562747 0.48867740320692926 -0.7455120724195884; 0.12923331248947625 0.32227484677350104 0.1879793984911528]
 [-0.40801692432226466 0.46884767588169035 -0.36141026079567345; -0.2773904635616495 -0.9746086560813164 0.26113408530961424; … ; 1.0769677647773832 -0.11436320965711218 -1.2895662379308004; 0.2856187372760472 1.1470867080820784 0.436153894141458]
```
```julia
Vbl = fitm.Vbl
```
```
4-element Vector{Matrix}:
 [-0.23221350649942402 0.5891168732065994 0.07642515391317986; 0.7607464762538525 0.6064931208295313 0.674847582189589; 0.3011913315908775 -0.13118586846138441 -0.7251746631780844; -0.5259519636251038 0.5175883229032195 -0.11340919075085852]
 [-0.4537703469097276 -0.3697957744699179 0.38548166419002783; 0.7769886790381423 -0.31753284623159983 0.625096894897853; -0.4363267868375587 -0.8731689279562419 0.6787177311389095]
 [-0.6612798979736141 0.4222002594039563 0.3219773142067391; -0.5710752899312963 -0.15488207530536102 -0.08580550040851531; 0.37567655711660647 -0.7638562292199929 0.48684592681033495; -0.30894827107285366 -0.46290619437435704 0.8074336311945677]
 [-0.9493374803550765 -0.08204958567684245 0.5908887482427636; 0.0743842707357129 0.6405670458917806 0.6692544240608542; 0.3053282310239113 0.7635062050879062 -0.45049861606343977]
```
For block k
```julia
k = 1
Tbl[k]
```
```
8×3 Matrix{Float64}:
 -1.79626   -0.00891862  -0.226653
  0.387904  -0.106566     0.212778
 -0.945808  -0.110159     0.124653
  0.534977   0.161467    -0.00130687
  0.452331   0.206683     0.432083
  1.03032    0.101524     0.470582
  0.300995  -0.58575     -1.08672
  0.035545   0.341718     0.0745846
```
```julia
Vbl[k]
```
```
4×3 Matrix{Float64}:
 -0.232214   0.589117   0.0764252
  0.760746   0.606493   0.674848
  0.301191  -0.131186  -0.725175
 -0.525952   0.517588  -0.113409
```
```julia
res = summary(model, Xbl)
@names res
```
```
(:explvarx, :explvarxx, :explxbl, :psal2, :contrxbl2t, :rvxbl2t, :rdxbl2t, :cortbl2t, :corx2t)
```
Proportion of the total XX' inertia explained by the global scores
```julia
res.explvarxx
```
| Row | lv | var | pvar | cumpvar |
|---|---|---|---|---|
| 1 | 1 | 1.54101 | 0.681971 | 0.681971 |
| 2 | 2 | 0.229883 | 0.101734 | 0.783705 |
| 3 | 3 | 0.173808 | 0.0769181 | 0.860623 |
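As a reading aid, cumpvar is simply the running sum of pvar, e.g.:

```julia
cumsum([0.681971, 0.101734, 0.0769181])    # ≈ [0.681971, 0.783705, 0.860623]
```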
Proportion of the inertia of each block (= Xbl[k]) explained by the global scores
```julia
explxbl = res.explxbl    # = specific weights 'lb' when 'bscal = :frob'
```
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| 1 | 0.670406 | 0.0117629 | 0.191347 |
| 2 | 0.663954 | 0.0408196 | 0.216385 |
| 3 | 0.780095 | 0.0605821 | 0.109844 |
| 4 | 0.20539 | 0.473718 | 0.279832 |
```julia
rowsum(explxbl)
```
```
4-element Vector{Float64}:
 0.8735162349324337
 0.9211580695715044
 0.9505211974172872
 0.9589401500789885
```
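As indicated in the comment above, with bscal = :frob these explained proportions correspond to the specific weights lb stored in the fitted model (the :lb field listed earlier). They can be displayed directly (the layout of lb, blocks in rows vs. columns, may differ from explxbl):

```julia
# Specific weights ('saliences') of each block for each global component
fitm.lb
```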
Contribution of each block to the global scores
```julia
contrxbl2t = res.contrxbl2t
```
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| 1 | 0.288987 | 0.020043 | 0.239962 |
| 2 | 0.286206 | 0.0695533 | 0.27136 |
| 3 | 0.33627 | 0.103227 | 0.137751 |
| 4 | 0.0885361 | 0.807177 | 0.350927 |
```julia
colsum(contrxbl2t)
```
```
3-element Vector{Float64}:
 0.9999999999999999
 1.0000000000000002
 1.0000000000000002
```
```julia
z = contrxbl2t
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1))
text!(ax, z[:, i], z[:, i + 1]; text = string.("j", 1:nbl), fontsize = 15)
xlims!(ax, [0, .4])
ylims!(ax, [0, .9])
f
```
RV coefficients between each block and the global LVs
```julia
res.rvxbl2t
```
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| 1 | 0.867627 | 0.0152298 | 0.247626 |
| 2 | 0.895813 | 0.0550478 | 0.291952 |
| 3 | 0.943026 | 0.0732225 | 0.132785 |
| 4 | 0.313562 | 0.723312 | 0.42727 |
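For reference, the RV coefficient between two matrices can also be computed by hand. Below is a minimal sketch assuming column-centered data (the values may differ slightly from res.rvxbl2t depending on the preprocessing applied internally); the function name rvcoef is ours:

```julia
using LinearAlgebra, Statistics

# RV(X, Y) = tr(XX'YY') / sqrt(tr((XX')^2) * tr((YY')^2))
function rvcoef(X, Y)
    Xc = X .- mean(X, dims = 1)
    Yc = Y .- mean(Y, dims = 1)
    A = Xc * Xc'
    B = Yc * Yc'
    tr(A * B) / sqrt(tr(A * A) * tr(B * B))
end

# E.g., RV between block 1 and the first global score (kept as a one-column matrix)
rvcoef(Matrix(Xbl[1]), T[:, 1:1])
```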
Correlation between the X-variables and the global scores
```julia
corx2t = res.corx2t
```
| Row | lv1 | lv2 | lv3 |
|---|---|---|---|
| 1 | -0.835675 | 0.280847 | 0.14693 |
| 2 | 0.873758 | 0.0923135 | 0.414077 |
| 3 | 0.601193 | -0.0347772 | -0.773383 |
| 4 | -0.826089 | 0.107675 | -0.095143 |
| 5 | -0.8111 | -0.163856 | 0.393378 |
| 6 | 0.880278 | -0.0891491 | 0.404286 |
| 7 | -0.678057 | -0.336407 | 0.602167 |
| 8 | -0.931598 | 0.165772 | 0.170208 |
| 9 | -0.983061 | -0.0743055 | -0.0554282 |
| 10 | 0.792075 | -0.448744 | 0.385157 |
| 11 | -0.665309 | -0.277749 | 0.652515 |
| 12 | -0.778559 | -0.102133 | 0.56566 |
| 13 | 0.0585256 | 0.765167 | 0.614377 |
| 14 | 0.229752 | 0.872646 | -0.395764 |
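These correlations can be cross-checked directly by concatenating the four blocks column-wise and correlating each variable with the global scores (a sketch; the result should be close to res.corx2t, up to the preprocessing applied internally):

```julia
using Statistics

Xcat = hcat(Matrix.(Xbl)...)    # 8 × 14 matrix of all sensory variables
cor(Xcat, T)                    # 14 × 3 correlations with the global scores
```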
```julia
z = corx2t
nvar = nro(z)
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1))
text!(ax, z[:, i], z[:, i + 1]; text = string.("v", 1:nvar), fontsize = 15)
xlims!(ax, [-1, 1])
ylims!(ax, [-1, 1])
f
```
Block k in the global score space
```julia
k = 4
v = Vbl[k]
nam = names(Xbl[k])
i = 1
f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1))
text!(ax, T[:, i], T[:, i + 1]; text = string.("w", 1:n), fontsize = 15)
scatter!(ax, v[:, i], v[:, i + 1])
text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15)
ylims!(ax, [-1, 1])
f
```
Block k in the block score space
```julia
k = 4    # fourth block
z = Tbl[k]
v = Vbl[k]
nam = names(Xbl[k])
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1), title = string("Block ", k))
text!(ax, z[:, i], z[:, i + 1]; text = string.("w", 1:n), fontsize = 15)
scatter!(ax, v[:, i], v[:, i + 1])
text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15)
ylims!(ax, [-1, 1])
f
```
Judge 4 was atypical compared with the other judges and mainly determined global score 2: in contrast to judges 1-3, judge 4 gave particularly high ratings to wines w6 and w8 (variable v13: expenses, and variable v14: brilliant).