ComDim - wines_sensory - Hanafi2008

This note replicates the analysis of the wines sensory dataset presented by Hanafi & Quanari (2008):

  • Hanafi, M., Qannari, E.M., 2008. Nouvelles propriétés de l’analyse en composantes communes et poids spécifiques. Journal de la société française de statistique 149, 75–97

where the ComDim method (Common components and specific weights analysis; aka CCSWA) is used to explore the data.

Wines sensory dataset: A jury of four expert tasters ("judges") evaluated the appearance of eight wines according to the procedure known as free profile. Each judge notes on a scale from 0 to 10 the products according to his/her own variables. For a product and a given variable, the note given by a judge is the intensity on which he/she perceives the variable. Each matrix Xk (k = 1, ..., 4) is associated to one judge.

The goal of the ComDim study was to evaluate if there was an agreement between judges or groups of judges and to assess the relationships among the eight wines.

Data importation

using Jchemo, JchemoData
using JLD2, CairoMakie
path_jdat = dirname(dirname(pathof(JchemoData)))
db = joinpath(path_jdat, "data/wines_sensory.jld2")
@load db dat
@names dat
(:X1, :X2, :X3, :X4)
X1 = dat.X1 
X2 = dat.X2
X3 = dat.X3
X4 = dat.X4
Xbl = [X1, X2, X3, X4]   # Same as: Xbl = [[X1]; [X2]; [X3]; [X4]]
4-element Vector{DataFrame}:
 8×4 DataFrame
 Row │ red      gilded   soft     plum
     │ Float64  Float64  Float64  Float64
─────┼────────────────────────────────────
   1 │     7.0      0.0      5.0      8.0
   2 │     5.0      6.0      6.0      3.0
   3 │     7.0      2.0      5.0      5.0
   4 │     5.0      7.0      7.0      4.0
   5 │     5.0      7.0      6.0      4.0
   6 │     6.0      8.0      6.0      1.0
   7 │     5.0      4.0     10.0      3.0
   8 │     6.0      6.0      6.0      5.0
 8×3 DataFrame
 Row │ ruby     coloured  intensity
     │ Float64  Float64   Float64
─────┼──────────────────────────────
   1 │     4.0       0.0        5.0
   2 │     3.0       6.0        5.0
   3 │     3.0       3.0        7.0
   4 │     1.0       6.0        3.0
   5 │     2.0       5.0        5.0
   6 │     1.0       5.0        4.0
   7 │     0.0       4.0        2.0
   8 │     2.0       6.0        4.0
 8×4 DataFrame
 Row │ red      blue     gilded   intensity
     │ Float64  Float64  Float64  Float64
─────┼──────────────────────────────────────
   1 │     7.0      4.0      2.0        6.0
   2 │     2.0      0.0      6.0        6.0
   3 │     6.0      3.0      4.0        7.0
   4 │     2.0      0.0      6.0        4.0
   5 │     5.0      1.0      5.0        6.0
   6 │     3.0      0.0      5.0        5.0
   7 │     2.0      0.0      4.0        3.0
   8 │     4.0      0.0      4.0        5.0
 8×3 DataFrame
 Row │ deep     expenses  brilliant
     │ Float64  Float64   Float64
─────┼──────────────────────────────
   1 │     9.0       7.0        9.0
   2 │     8.0       6.0        7.0
   3 │    10.0       6.0        7.0
   4 │     7.0       7.0        8.0
   5 │     8.0       7.0        8.0
   6 │     8.0       8.0       10.0
   7 │     6.0       5.0       10.0
   8 │     8.0       9.0       10.0
n = nro(X1)
nbl = length(Xbl)
4

Model fitting

Below, each block is scaled by its Frobenius norm, and three global components (scores) are computed

nlv = 3
bscal = :frob
model = comdim(; nlv, bscal)
fit!(model, Xbl)
fitm = model.fitm
@names fitm
(:T, :U, :W, :Tb, :Tbl, :Vbl, :lb, :mu, :fitmbl, :weights, :niter, :par)
  • Global scores

@head T = fitm.T
3×3 Matrix{Float64}:
 -2.50761    0.335161  -0.297369
  0.644503  -0.692286   0.247392
 -1.564     -0.540373   0.219861
... (8, 3)
i = 1 
f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, xlabel = string("PC", i), ylabel = string("PC", i + 1)) 
text!(ax, T[:, i], T[:, i + 1]; text = string.("w", 1:n), fontsize = 15)
f
  • Block scores and block loadings

Tbl = fitm.Tbl
4-element Vector{Matrix}:
 [-1.7962627913744986 -0.008918615631455138 -0.22665348117393225; 0.3879039556775733 -0.1065661506250574 0.2127782058108359; … ; 0.30099461206273337 -0.585749646663945 -1.0867198878257494; 0.035545038805747414 0.3417179548041017 0.07458461012188208]
 [-1.7044871814196665 0.5357943376354091 -0.4170589964322815; 0.1995449497670047 -0.6605733548084997 0.400922304133755; … ; 0.6150299861716623 0.9086489000953766 -1.0021145160270302; 0.5308334812361191 -0.13762360842297028 0.5902444200693617]
 [-1.6933885952738224 0.2296343453843308 -0.2616132417652677; 0.7329488213391452 -0.5749914914629086 0.17354614604739924; … ; 0.7929766870562747 0.48867740320692926 -0.7455120724195884; 0.12923331248947625 0.32227484677350104 0.1879793984911528]
 [-0.40801692432226466 0.46884767588169035 -0.36141026079567345; -0.2773904635616495 -0.9746086560813164 0.26113408530961424; … ; 1.0769677647773832 -0.11436320965711218 -1.2895662379308004; 0.2856187372760472 1.1470867080820784 0.436153894141458]
Vbl = fitm.Vbl
4-element Vector{Matrix}:
 [-0.23221350649942402 0.5891168732065994 0.07642515391317986; 0.7607464762538525 0.6064931208295313 0.674847582189589; 0.3011913315908775 -0.13118586846138441 -0.7251746631780844; -0.5259519636251038 0.5175883229032195 -0.11340919075085852]
 [-0.4537703469097276 -0.3697957744699179 0.38548166419002783; 0.7769886790381423 -0.31753284623159983 0.625096894897853; -0.4363267868375587 -0.8731689279562419 0.6787177311389095]
 [-0.6612798979736141 0.4222002594039563 0.3219773142067391; -0.5710752899312963 -0.15488207530536102 -0.08580550040851531; 0.37567655711660647 -0.7638562292199929 0.48684592681033495; -0.30894827107285366 -0.46290619437435704 0.8074336311945677]
 [-0.9493374803550765 -0.08204958567684245 0.5908887482427636; 0.0743842707357129 0.6405670458917806 0.6692544240608542; 0.3053282310239113 0.7635062050879062 -0.45049861606343977]
  • For block k

k = 1
Tbl[k]
8×3 Matrix{Float64}:
 -1.79626   -0.00891862  -0.226653
  0.387904  -0.106566     0.212778
 -0.945808  -0.110159     0.124653
  0.534977   0.161467    -0.00130687
  0.452331   0.206683     0.432083
  1.03032    0.101524     0.470582
  0.300995  -0.58575     -1.08672
  0.035545   0.341718     0.0745846
Vbl[k]
4×3 Matrix{Float64}:
 -0.232214   0.589117   0.0764252
  0.760746   0.606493   0.674848
  0.301191  -0.131186  -0.725175
 -0.525952   0.517588  -0.113409

Summary outputs

res = summary(model, Xbl)
@names res
(:explvarx, :explvarxx, :explxbl, :psal2, :contrxbl2t, :rvxbl2t, :rdxbl2t, :cortbl2t, :corx2t)
  • Proportion of the total XX' inertia explained by the global scores

res.explvarxx
3×4 DataFrame
Rowlvvarpvarcumpvar
Int64Float64Float64Float64
111.541010.6819710.681971
220.2298830.1017340.783705
330.1738080.07691810.860623
  • Proportion of the inertia of each block (= Xbl[k]) explained by the global scores

explxbl = res.explxbl  # = specific weights 'lb' when 'bscal = :frob'
4×3 DataFrame
Rowlv1lv2lv3
Float64Float64Float64
10.6704060.01176290.191347
20.6639540.04081960.216385
30.7800950.06058210.109844
40.205390.4737180.279832
rowsum(explxbl)
4-element Vector{Float64}:
 0.8735162349324337
 0.9211580695715044
 0.9505211974172872
 0.9589401500789885
  • Contribution of each block to the global scores

contrxbl2t = res.contrxbl2t
4×3 DataFrame
Rowlv1lv2lv3
Float64Float64Float64
10.2889870.0200430.239962
20.2862060.06955330.27136
30.336270.1032270.137751
40.08853610.8071770.350927
colsum(contrxbl2t)
3-element Vector{Float64}:
 0.9999999999999999
 1.0000000000000002
 1.0000000000000002
z = contrxbl2t
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = string("PC", i), 
    ylabel = string("PC", i + 1)) 
text!(ax, z[:, i], z[:, i + 1]; text = string.("j", 1:nbl), fontsize = 15) 
xlims!(ax, [0, .4]) 
ylims!(ax, [0, .9]) 
f
  • RV coefficients between each block and the global LVs

res.rvxbl2t
4×3 DataFrame
Rowlv1lv2lv3
Float64Float64Float64
10.8676270.01522980.247626
20.8958130.05504780.291952
30.9430260.07322250.132785
40.3135620.7233120.42727
  • Correlation between the X-variables and the global scores

corx2t = res.corx2t
14×3 DataFrame
Rowlv1lv2lv3
Float64Float64Float64
1-0.8356750.2808470.14693
20.8737580.09231350.414077
30.601193-0.0347772-0.773383
4-0.8260890.107675-0.095143
5-0.8111-0.1638560.393378
60.880278-0.08914910.404286
7-0.678057-0.3364070.602167
8-0.9315980.1657720.170208
9-0.983061-0.0743055-0.0554282
100.792075-0.4487440.385157
11-0.665309-0.2777490.652515
12-0.778559-0.1021330.56566
130.05852560.7651670.614377
140.2297520.872646-0.395764
z = corx2t 
nvar = nro(z) 
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = string("PC", i), 
    ylabel = string("PC", i + 1)) 
text!(ax, z[:, i], z[:, i + 1]; text = string.("v", 1:nvar), fontsize = 15) 
xlims!(ax, [-1, 1]) 
ylims!(ax, [-1, 1]) 
f

Simultaneous representation of observations and loadings

  • Block k in the global score space

k = 4
v = Vbl[k]
nam = names(Xbl[k])
i = 1
f, ax = plotxy(T[:, i], T[:, i + 1]; zeros = true, xlabel = string("PC", i), 
    ylabel = string("PC", i + 1)) 
text!(ax, T[:, i], T[:, i + 1]; text = string.("w", 1:n), fontsize = 15) 
scatter!(ax, v[:, i], v[:, i + 1]) 
text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15) 
ylims!(ax, [-1, 1]) 
f
  • Block k in the block score space

k = 4  # fourth block
z = Tbl[k] 
v = Vbl[k] 
nam = names(Xbl[k]) 
i = 1
f, ax = plotxy(z[:, i], z[:, i + 1]; zeros = true, xlabel = string("PC", i), 
    ylabel = string("PC", i + 1), title = string("Block ", k)) 
text!(ax, z[:, i], z[:, i + 1]; text = string.("w", 1:n), fontsize = 15) 
scatter!(ax, v[:, i], v[:, i + 1]) 
text!(ax, v[:, i], v[:, i + 1]; text = nam, fontsize = 15) 
ylims!(ax, [-1, 1]) 
f

Main conclusion

Judge 4 was atypical from the other judges and determined the global score 2. In contrast to judges 1-3, judge 4 gave particularly high ratings to wines w6 and w8 (variable v13: expenses, and variable v14: brilliant).