Science for product makers: sensory evaluation methodologies and result analysis

Written by Dr. Harold Han - "The Happy Chemist" | 6/30/24 4:05 AM

This post is also published as an article on Harold's LinkedIn profile. You can read and leave comments here.

My past two posts discussed the subjective nature of sensory evaluation, how to build a supertaster team, the concept of the “Golden Tongue”, and the operational details of conducting a sensory evaluation. In this article, I’ll discuss how to present samples and how to analyze the results. This is the most critical step of the evaluation process.

Step one: have a clear goal

What question are you trying to answer with this test? Are you picking the less bitter sample from two formulas? Are you confirming whether an ingredient switch would cause a flavor change? Are you trying to pass sensory quality on the production floor? The sample presentation approach will differ for each goal.

Step two: align on who and how many people will taste

Two approaches are common: a small, selected group of supertasters with a “Golden Tongue”, or a large, unselected group that represents the consumer demographic. Your choice affects both the number of tasters and how you analyze the data. Over decades of research, flavor scientists have designed many sampling methods. Details can be found in this review and this book. Today, though, we will focus on just two basic methods that we believe are most useful for the cannabis beverage industry.

Pair Test (or A/B Test)

This is the simplest and most effective method for picking the better sample out of two. Present two samples with random sample codes. Tasters try the first sample, record sensory keywords, cleanse their palates, try the second sample, then record notes again. Its simplicity lets tasters focus solely on any slight flavor difference between the two samples.

Sample presentation for Pair Test
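
For readers who plan tastings with a script, the blind-coding step described above could be sketched in Python roughly as follows. This is an illustration only, not a procedure from this series: the helper name `pair_test_plan`, the three-digit code format, and the choice to alternate which sample is served first are all assumptions.

```python
import random


def pair_test_plan(num_tasters, seed=None):
    """Build a blind serving plan for a Pair Test (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    plan = []
    for taster in range(1, num_tasters + 1):
        # Two distinct random three-digit codes hide which formula is which.
        code_a, code_b = rng.sample(range(100, 1000), 2)
        servings = [("A", code_a), ("B", code_b)]
        # Assumption: alternate the first-served sample to balance order effects.
        if taster % 2 == 0:
            servings.reverse()
        plan.append({"taster": taster, "servings": servings})
    return plan
```

Calling `pair_test_plan(10, seed=2024)` would return one entry per taster, each listing the two coded samples in the order they should be served. Only the organizer keeps the mapping from code to formula.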

Triangle Test

In other situations, you want to ensure the flavor has not changed compared to the original batch. In this case, the triangle test is the best method. Here are two real-world scenarios where a triangle test helps:

Flavor matching your original SKU when switching ingredient providers
Confirming consumer complaints about the flavor of a particular batch by comparing it to an older batch

Performing the triangle test

The basis of the triangle test is to present 3 samples: 2 of them are the same and 1 is different. The goal is to see whether tasters can pick out the one that tastes different.

Sample presentation for Triangle Test
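
A triangle-test serving plan could be sketched in Python along these lines. The helper name `triangle_test_plan`, the three-digit codes, and the idea of cycling through all six serving orders are assumptions made for illustration, not details taken from this series.

```python
import random

# The six possible serving orders when two of the three cups share a formula.
ARRANGEMENTS = ["AAB", "ABA", "BAA", "BBA", "BAB", "ABB"]


def triangle_test_plan(num_tasters, seed=None):
    """Build a blind serving plan for a Triangle Test (hypothetical helper)."""
    rng = random.Random(seed)
    plan = []
    for taster in range(num_tasters):
        # Cycle through the arrangements so each is used about equally often.
        arrangement = ARRANGEMENTS[taster % len(ARRANGEMENTS)]
        odd_formula = "A" if arrangement.count("A") == 1 else "B"
        plan.append({
            "taster": taster + 1,
            "arrangement": arrangement,
            # Three distinct three-digit codes, one per cup, in serving order.
            "codes": rng.sample(range(100, 1000), 3),
            # Position (1 to 3) of the cup that differs: the answer key.
            "odd_position": arrangement.index(odd_formula) + 1,
        })
    return plan
```

Each entry tells the organizer which cups to pour and which position holds the odd sample, so a taster's answer can later be scored as correct or incorrect.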

The concept and execution of Pair and Triangle tests are straightforward, but the data analysis can be tricky.

Let’s discuss how to analyze the data in two scenarios.

Scenario one: small/selected group

This group includes supertasters and the “Golden Tongue”. Usually a consensus can be reached among supertasters because their palates are very sensitive and they know the products well. And if there is conflict, the “Golden Tongue” would cast the determining vote.

Scenario two: large/unselected group

Intuitively, we believe more tasters better represent broader consumer feedback. At the same time, more tasters also introduce more variation in the data, because tasting ability differs genetically from person to person. To address this, we need the concept of the p-value from statistics: the probability of seeing a result at least as lopsided as ours if tasters were only guessing. It lets us judge whether a result is significant even when the underlying data vary widely. Calculating the p-value by hand is tedious and is usually done with statistical software, but it is the key to analyzing results accurately. Here is a detailed article about the p-value if you want to dig in more.

Evaluating the significance of your results

Let’s say we conducted two Pair Tests, one with 10 tasters and the other with 100. The final results are in the table below. Can we confidently say sample A is better than sample B in both situations?

You may think that since 60% of tasters picked sample A in both tests, A should be better than B. However, statistically, we can only say that with confidence for the test conducted with 100 tasters. Why can’t we reach the same conclusion for the test conducted with 10 tasters? The answer lies in the p-value.

Let’s assume sample A is indeed only slightly better than B, so some palates perceive the two as similar. In a wide range of tasters with different tasting capabilities, some people will not be able to discern a difference, but they HAVE TO pick one. Those tasters have a 50% chance of picking the right answer (Sample A) purely by guessing, which is a high probability. This creates a problem: how many people need to select Sample A for us to be confident that it is indeed better, and that the result is not just guesswork?

Using statistical software, we can determine the number of Sample A votes needed to achieve a p-value of < 0.05. The smaller the p-value, the more significant the results. Let’s look at the chart below. If we have 10 tasters, we need 9 of them to pick the same sample to achieve a p-value < 0.05, but if we have 100 tasters, we need 59. So while 6/10 and 60/100 both equal 60%, 6 votes out of 10 is not a statistically significant preference, whereas 60 votes out of 100 is.
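
To make the numbers above concrete, here is a minimal Python sketch of the calculation. The article does not say which test its software ran; a one-sided exact binomial test with a 50% guessing rate is assumed here because it reproduces the 9 and 59 quoted above. The function names `guessing_p_value` and `min_votes_needed` are hypothetical.

```python
from math import comb


def guessing_p_value(votes, tasters, chance=0.5):
    """Probability of at least `votes` picks for one sample if everyone guessed."""
    return sum(
        comb(tasters, k) * chance**k * (1 - chance) ** (tasters - k)
        for k in range(votes, tasters + 1)
    )


def min_votes_needed(tasters, chance=0.5, alpha=0.05):
    """Smallest vote count whose p-value falls below alpha, or None if impossible."""
    for votes in range(tasters + 1):
        if guessing_p_value(votes, tasters, chance) < alpha:
            return votes
    return None


for tasters in (10, 100):
    print(tasters, "tasters ->", min_votes_needed(tasters), "votes needed")
```

Under these assumptions, `min_votes_needed(10)` returns 9 and `min_votes_needed(100)` returns 59. The p-value for 6 votes out of 10 is about 0.38, while for 60 out of 100 it is about 0.028, which is why only the larger test clears the 0.05 bar.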

The same logic applies to the Triangle Test, where tasters have to pick 1 sample out of 3 even if they cannot discern a difference. Anyone has a 33% (1 in 3) chance of picking the correct sample by guessing. The table below shows how many people need to pick the correct sample for us to be confident that it indeed tastes different. Here is a P-value calculator I found useful.
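
For the Triangle Test, the same kind of calculation can be sketched with a 1-in-3 guessing rate. As before, this assumes a one-sided exact binomial test and a 0.05 threshold. The function names and the panel sizes in the loop are illustrative choices, not values taken from the article's table.

```python
from math import comb

CHANCE = 1 / 3  # odds of picking the odd sample by pure guessing


def triangle_p_value(correct, tasters):
    """Probability of at least `correct` right answers if every taster guessed."""
    return sum(
        comb(tasters, k) * CHANCE**k * (1 - CHANCE) ** (tasters - k)
        for k in range(correct, tasters + 1)
    )


def min_correct_needed(tasters, alpha=0.05):
    """Smallest number of correct picks with a p-value below alpha, or None."""
    for correct in range(tasters + 1):
        if triangle_p_value(correct, tasters) < alpha:
            return correct
    return None


# Panel sizes here are arbitrary examples.
for tasters in (5, 10, 20, 50, 100):
    print(tasters, "tasters ->", min_correct_needed(tasters), "correct picks needed")
```

With 10 tasters, for example, this works out to 7 correct picks. Because guessing succeeds less often here than in a Pair Test, fewer correct answers are needed for the same panel size.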

Sensory evaluation mixes science, art and human physiology together. It seems simple but getting accurate results can be tricky.

A lot of details have been shared in the last 3 posts on sensory. Let’s review:

Genetic differences make people perceive flavor differently. This subjectivity creates one of the biggest challenges in sensory evaluation.
A group of supertasters can be leveraged to do sensory work. This leads to more consistent results that you can be more confident in.
In a startup where decisions need to be made quickly, a “Golden Tongue” usually casts the determining vote on flavor.
Removing bias during sensory evaluation is key. There are many details during the tasting process to help remove bias.
The correct testing methods need to be applied based on what you want to achieve from sensory testing. Pair and Triangle tests are the most common methods.
When tasting with a large / unselected group, a p-value below 0.05 is needed to call a result significant.

What did you think of this series? Have you participated in a sensory analysis before? Let me know in the comments!

Dr. Harold Han — the “Happy Chemist” — combines his storied background in emulsion chemistry and science with curiosity and fascination in the rapidly growing cannabis industry. Developing nano and micro emulsions his entire career, Harold holds a Ph.D in Surface Chemistry from NYU and is the holder of multiple patents for his inventions in emulsion chemistry.

As the Chief Science Officer at Vertosa, Harold spearheads the company’s development of industry-leading and customized active ingredients for infused product makers, offering pre-suspended aqueous solutions to create incredibly homogenous and stable products while maximizing bioavailability, clarity, and taste.

To learn more about the science of cannabis, make sure to follow Harold on LinkedIn and check out his Happy Chemist videos.
