ComDim - wines_sensory

This note replicates the analysis of the wines sensory dataset presented by Hanafi & Quanari (2008)

Hanafi, M., Qannari, E.M., 2008. Nouvelles propriétés de l’analyse en composantes communes et poids spécifiques. Journal de la société française de statistique 149, 75–97

where the ComDim method (Common components and specific weights analysis, aka CCSWA) was used to explore the data.

Wines sensory dataset: A jury of four expert tasters ("judges") evaluated the appearance of eight wines according to the procedure known as free profile. Each judge noted on a scale from 0 to 10 the products according to his/her own variables. For a product and a given variable, the note given by a judge was the intensity on which he/she perceives the variable. Each matrix Xk (k = 1, ..., 4) was associated to one judge.

The goal of the ComDim study was to evaluate if there was an agreement between judges (or groups of judges), and to evaluate the relationships potentially existing among the eight wines.

Data importation

using Jchemo, JchemoData
using JLD2, CairoMakie

path_jdat = dirname(dirname(pathof(JchemoData)))
db = joinpath(path_jdat, "data/wines_sensory.jld2")
@load db dat
@names dat

(:X1, :X2, :X3, :X4)

X1 = dat.X1 
X2 = dat.X2
X3 = dat.X3
X4 = dat.X4
Xbl = [X1, X2, X3, X4]   # Same as: Xbl = [[X1]; [X2]; [X3]; [X4]]

4-element Vector{DataFrames.DataFrame}:
 8×4 DataFrame
 Row │ red      gilded   soft     plum
     │ Float64  Float64  Float64  Float64
─────┼────────────────────────────────────
   1 │     7.0      0.0      5.0      8.0
   2 │     5.0      6.0      6.0      3.0
   3 │     7.0      2.0      5.0      5.0
   4 │     5.0      7.0      7.0      4.0
   5 │     5.0      7.0      6.0      4.0
   6 │     6.0      8.0      6.0      1.0
   7 │     5.0      4.0     10.0      3.0
   8 │     6.0      6.0      6.0      5.0
 8×3 DataFrame
 Row │ ruby     coloured  intensity
     │ Float64  Float64   Float64
─────┼──────────────────────────────
   1 │     4.0       0.0        5.0
   2 │     3.0       6.0        5.0
   3 │     3.0       3.0        7.0
   4 │     1.0       6.0        3.0
   5 │     2.0       5.0        5.0
   6 │     1.0       5.0        4.0
   7 │     0.0       4.0        2.0
   8 │     2.0       6.0        4.0
 8×4 DataFrame
 Row │ red      blue     gilded   intensity
     │ Float64  Float64  Float64  Float64
─────┼──────────────────────────────────────
   1 │     7.0      4.0      2.0        6.0
   2 │     2.0      0.0      6.0        6.0
   3 │     6.0      3.0      4.0        7.0
   4 │     2.0      0.0      6.0        4.0
   5 │     5.0      1.0      5.0        6.0
   6 │     3.0      0.0      5.0        5.0
   7 │     2.0      0.0      4.0        3.0
   8 │     4.0      0.0      4.0        5.0
 8×3 DataFrame
 Row │ deep     expenses  brilliant
     │ Float64  Float64   Float64
─────┼──────────────────────────────
   1 │     9.0       7.0        9.0
   2 │     8.0       6.0        7.0
   3 │    10.0       6.0        7.0
   4 │     7.0       7.0        8.0
   5 │     8.0       7.0        8.0
   6 │     8.0       8.0       10.0
   7 │     6.0       5.0       10.0
   8 │     8.0       9.0       10.0

n = nro(X1)
nbl = length(Xbl)

Model fitting

In the ComDim analysis below, each block is scaled by its Frobenius norm (to equalize the group importances in regards to their total inertia), and three global components (scores) are computed

nlv = 3
bscal = :frob
model = comdim(; nlv, bscal)
fit!(model, Xbl)
fitm = model.fitm
@names fitm

(:T, :U, :W, :Tb, :Tbl, :Vbl, :lb, :mu, :fitm_bl, :weights, :niter, :par)

Global scores

@head T = fitm.T

3×3 Matrix{Float64}:
 -2.50761    0.335161  -0.297369
  0.644503  -0.692286   0.247392
 -1.564     -0.540373   0.219861
... (8, 3)

i = 1  # define the pair of successive components 
f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, xlabel = "PC$i", ylabel = "PC$(i + 1)") 
text!(ax, T[:, i], T[:, i + 1]; text = string.("wi", 1:n), fontsize = 15)
f

Block scores

Tbl = fitm.Tbl

4-element Vector{Matrix}:
 [-1.7962627913744986 -0.008918615631455138 -0.22665348117393225; 0.3879039556775733 -0.1065661506250574 0.2127782058108359; … ; 0.30099461206273337 -0.585749646663945 -1.0867198878257494; 0.035545038805747414 0.3417179548041017 0.07458461012188208]
 [-1.7044871814196665 0.5357943376354091 -0.4170589964322815; 0.1995449497670047 -0.6605733548084997 0.400922304133755; … ; 0.6150299861716623 0.9086489000953766 -1.0021145160270302; 0.5308334812361191 -0.13762360842297028 0.5902444200693617]
 [-1.6933885952738224 0.2296343453843308 -0.2616132417652677; 0.7329488213391452 -0.5749914914629086 0.17354614604739924; … ; 0.7929766870562747 0.48867740320692926 -0.7455120724195884; 0.12923331248947625 0.32227484677350104 0.1879793984911528]
 [-0.40801692432226466 0.46884767588169035 -0.36141026079567345; -0.2773904635616495 -0.9746086560813164 0.26113408530961424; … ; 1.0769677647773832 -0.11436320965711218 -1.2895662379308004; 0.2856187372760472 1.1470867080820784 0.436153894141458]

Block loadings

Vbl = fitm.Vbl

4-element Vector{Matrix}:
 [-0.23221350649942402 0.5891168732065994 0.07642515391317986; 0.7607464762538525 0.6064931208295313 0.674847582189589; 0.3011913315908775 -0.13118586846138441 -0.7251746631780844; -0.5259519636251038 0.5175883229032195 -0.11340919075085852]
 [-0.4537703469097276 -0.3697957744699179 0.38548166419002783; 0.7769886790381423 -0.31753284623159983 0.625096894897853; -0.4363267868375587 -0.8731689279562419 0.6787177311389095]
 [-0.6612798979736141 0.4222002594039563 0.3219773142067391; -0.5710752899312963 -0.15488207530536102 -0.08580550040851531; 0.37567655711660647 -0.7638562292199929 0.48684592681033495; -0.30894827107285366 -0.46290619437435704 0.8074336311945677]
 [-0.9493374803550765 -0.08204958567684245 0.5908887482427636; 0.0743842707357129 0.6405670458917806 0.6692544240608542; 0.3053282310239113 0.7635062050879062 -0.45049861606343977]

For block k

k = 1
Tbl[k]

8×3 Matrix{Float64}:
 -1.79626   -0.00891862  -0.226653
  0.387904  -0.106566     0.212778
 -0.945808  -0.110159     0.124653
  0.534977   0.161467    -0.00130687
  0.452331   0.206683     0.432083
  1.03032    0.101524     0.470582
  0.300995  -0.58575     -1.08672
  0.035545   0.341718     0.0745846

Vbl[k]

4×3 Matrix{Float64}:
 -0.232214   0.589117   0.0764252
  0.760746   0.606493   0.674848
  0.301191  -0.131186  -0.725175
 -0.525952   0.517588  -0.113409

Summary outputs

res = summary(model, Xbl)
@names res

(:explvarx, :explvarxx, :explxbl, :psal2, :contrxbl2t, :rvxbl2t, :rdxbl2t, :cortbl2t, :corx2t)

Proportion of the total XX' inertia explained by the global scores

res.explvarxx

3×4 DataFrame

Row	lv	var	pvar	cumpvar
	Int64	Float64	Float64	Float64
1	1	1.54101	0.681971	0.681971
2	2	0.229883	0.101734	0.783705
3	3	0.173808	0.0769181	0.860623

Proportion of the inertia of each block (= Xbl[k]) explained by the global scores

explxbl = res.explxbl  # = specific weights 'lb' when 'bscal = :frob'

4×3 DataFrame

Row	lv1	lv2	lv3
	Float64	Float64	Float64
1	0.670406	0.0117629	0.191347
2	0.663954	0.0408196	0.216385
3	0.780095	0.0605821	0.109844
4	0.20539	0.473718	0.279832

rowsum(explxbl)

4-element Vector{Float64}:
 0.8735162349324337
 0.9211580695715044
 0.9505211974172872
 0.9589401500789885

Contribution of each block to the global scores

contrxbl2t = res.contrxbl2t

4×3 DataFrame

Row	lv1	lv2	lv3
	Float64	Float64	Float64
1	0.288987	0.020043	0.239962
2	0.286206	0.0695533	0.27136
3	0.33627	0.103227	0.137751
4	0.0885361	0.807177	0.350927

colsum(contrxbl2t)

3-element Vector{Float64}:
 0.9999999999999999
 1.0000000000000002
 1.0000000000000002

z = contrxbl2t
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = "PC$i", ylabel = "PC$(i + 1)") 
text!(ax, z[:, i], z[:, i + 1]; text = string.("ju", 1:nbl), fontsize = 15) 
xlims!(ax, [0, .4]) 
ylims!(ax, [0, .9]) 
f

RV coefficients between each block and the global LVs

res.rvxbl2t

4×3 DataFrame

Row	lv1	lv2	lv3
	Float64	Float64	Float64
1	0.867627	0.0152298	0.247626
2	0.895813	0.0550478	0.291952
3	0.943026	0.0732225	0.132785
4	0.313562	0.723312	0.42727

Correlation between the X-variables and the global scores

corx2t = res.corx2t

14×3 DataFrame

Row	lv1	lv2	lv3
	Float64	Float64	Float64
1	-0.835675	0.280847	0.14693
2	0.873758	0.0923135	0.414077
3	0.601193	-0.0347772	-0.773383
4	-0.826089	0.107675	-0.095143
5	-0.8111	-0.163856	0.393378
6	0.880278	-0.0891491	0.404286
7	-0.678057	-0.336407	0.602167
8	-0.931598	0.165772	0.170208
9	-0.983061	-0.0743055	-0.0554282
10	0.792075	-0.448744	0.385157
11	-0.665309	-0.277749	0.652515
12	-0.778559	-0.102133	0.56566
13	0.0585256	0.765167	0.614377
14	0.229752	0.872646	-0.395764

z = corx2t 
nvar = nro(z) 
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = "PC$i", ylabel = "PC$(i + 1)") 
text!(ax, z[:, i], z[:, i + 1]; text = string.("va", 1:nvar), fontsize = 15) 
xlims!(ax, [-1, 1]) 
ylims!(ax, [-1, 1]) 
f

Simultaneous representation of observations and loadings

Block k in the global score space

k = 4
v = Vbl[k]
nam = names(Xbl[k])
i = 1
f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, title = "Block$k", xlabel = "PC$i", ylabel = "PC$(i + 1)") 
text!(ax, T[:, i], T[:, i + 1]; text = string.("wi", 1:n), fontsize = 15) 
scatter!(ax, v[:, i], v[:, i + 1]) 
text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15) 
ylims!(ax, [-1, 1]) 
f

Block k in the block score space

k = 4  # example of the fourth block
z = Tbl[k] 
v = Vbl[k] 
nam = names(Xbl[k]) 
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, title = "Block$k", xlabel = "PC$i", ylabel = "PC$(i + 1)") 
text!(ax, z[:, i], z[:, i + 1]; text = string.("wi", 1:n), fontsize = 15) 
scatter!(ax, v[:, i], v[:, i + 1]) 
text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15) 
ylims!(ax, [-1, 1]) 
f

Main conclusion of the study

Judge 4 was atypical from the other judges and determined the global score 2. In contrast to judges 1-2-3, judge 4 gave high ratings to wines wi6 and wi8 (variables v13: expenses v14: brilliant).