She claimed that standing in a particular pose for 2 minutes a day could make you more successful. This became known as the power pose.
Her claims were based on a study she conducted that measured self-reported feelings of power, observed risk-taking, and cortisol and testosterone levels in a sample of 42. Participants held two poses for a minute each - half in power poses, the others in non-power (weak) poses.
High power poses
Her study claimed that holding the power pose was like a superpower, giving you a physical and psychological edge in all kinds of high-stress circumstances, like negotiations and job interviews.
Low power poses
For a few years the power pose was catnip to corporate and boardroom types. All sorts of C-suites and execs were known to strike a pose to improve their success.
Meanwhile, in November 2011, a movement kicked off challenging the status quo in science. A number of scientists were worried about the quality of results being published, and asked whether enough work was being done to independently validate those results.
Brian Nosek (Center for Open Science, University of Virginia) set out to reproduce published studies to see if their effects would replicate. The issue wasn't new: Feynman raised it back in 1974 in his "Cargo Cult Science" lecture.
You'll never believe what happened when scientists attempted to replicate 100 experiments...
Just wait till you see their results. Psychologists tried to recreate 100 studies, all recently published in academic journals. The group went to extensive lengths to remain true to the original studies, even consulting the original authors.
97% of the original results showed a statistically significant effect; this was reproduced in only 36% of the replication attempts.
digest.bps.org.uk
Their results were pretty poor.
And then this happened...
About a year after her viral TED fame, another researcher tried to replicate Amy Cuddy's results
This study failed to reproduce her findings on hormone changes despite using four times the sample size. Her research partner Dana Carney publicly disavowed the work. Then everyone rushed to debunk it.
What's to blame?
"Publish or perish"
No replication studies
Bad incentives! Scientific research in its worst form: shitty, low-powered studies designed to be overstated as click-bait titles and turned into scientific CV catnip
Sound similar to UXR?
"Prove me right"
Preferring 'hard data' over qual data
Bad incentives and a culture not set up for learning. Thinking numbers are more 'clean' or less biased than qual data.
UXR has different goals to Science
True. Maybe reproducibility isn't the goal, but it's good to know what's happening.
Chipping away at the crystal of knowledge
Scientific research: in its purest form, a pursuit to chip a small shard off the crystal of knowledge
Reducing business risk
UX research: in its purest form, a pursuit to reduce risk and help a team intimately understand the people using their product
So does reproducibility even matter in UXR?
Said with ❤️: design thinking, the double diamond, the build-measure-learn loop, UX research - all inspired by the scientific method. It's important to watch what's happening in science, since that's where we draw our legitimacy from.
Also worth keeping in mind: plenty of science 'truths' have since been debunked - we don't want them leaking into our work accidentally.
Make the hard problem of internal buy-in harder
UX Research is plagued by overgeneralisation and mismatches between tools and problem spaces. It's incredibly important to be specific about what you can learn from any given research method.
Should UX Research be reproducible? My answer is...
*It depends. It depends on what you mean by reproducibility.
At least three levels to reproducibility
1. Experiment design
Model & Methodology
Can I reproduce your experiment? As in methodology, model, sample set, sample size, research variables, etc
2. Data captured
Do I get equivalent results? As in raw data
3. Interpretation
Do I interpret the data the same way? As in, do I draw the same conclusions from the raw data?
UX Research types
Qualitative -vs- Quantitative
Depending on the type of research, we may not even be striving for 'statistical significance'
With Qual at least, we aren't striving for statistical significance and the question of reproducibility goes out the window.
✔ Experiment design
× Data captured
So when we capture small samples of deep-dive qualitative data, we don't expect to reproduce the exact same results - because PEOPLE
✔ Data captured
Hopefully 🤞 But when we run A/B tests or data-driven tests of engagement, we should theoretically expect to be able to reproduce every aspect of that test.
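For the quant case, the thing we'd expect to replicate can be sketched concretely. A minimal two-proportion z-test on conversion counts (all numbers and the function name are illustrative, not from any real study):

```python
# Sketch of the kind of A/B result we'd theoretically expect to replicate:
# a two-proportion z-test comparing conversion rates between variants.
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pool the rates under the null hypothesis of "no difference"
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# 10% vs 13% conversion, 2000 users per arm (made-up numbers):
z = two_proportion_z(200, 2000, 260, 2000)
print(round(z, 2))  # 2.97 - |z| > 1.96 is significant at alpha = 0.05
```

If the effect is real and the product hasn't changed, re-running this test on a fresh sample should give a z statistic in the same ballpark - that's the reproducibility claim in miniature.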
How do we do it better?
Get inspired by the Open Science movement
Within the constraints of our current workplaces...
So you want statistical significance
Spend some time booting up
Work on your research design hygiene
Get better at understanding the strength of your signal
Identify your variables, and don't tweak them once the test has started. Pre-test probability / Bayesian probability is an interesting rabbit hole for those who want to skill up.
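That pre-test probability rabbit hole fits in a few lines: given a prior probability that your hypothesis is true, plus your test's power and false-positive rate, you can estimate how often a "significant" result reflects a real effect. (The numbers and function name below are mine, purely for illustration.)

```python
# Sketch: what a "significant" result is worth depends on the pre-test
# probability that the hypothesis was true in the first place.

def ppv(prior, power=0.8, alpha=0.05):
    """Probability that a statistically significant result is a real effect."""
    true_pos = power * prior          # real effects we correctly detect
    false_pos = alpha * (1 - prior)   # noise that slips past the alpha bar
    return true_pos / (true_pos + false_pos)

# A long-shot hypothesis (1-in-10 prior) with a well-powered test:
print(round(ppv(0.10), 2))             # 0.64 - a third of "wins" are noise
# The same hypothesis with a low-powered study (power = 0.2):
print(round(ppv(0.10, power=0.2), 2))  # 0.31 - most "wins" are false alarms
```

This is the arithmetic behind "low-powered studies produce click-bait": weak tests of unlikely hypotheses mostly generate false positives, even when the p-values look fine.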
There are lots of tools out there - use them! See how they differ: plugging the same results into different tools will tell you different things.
nngroup.com/articles/which-ux-research-methods/
Reminder: we have this big world of research methods! We don't need statistical significance to learn something, or to decide what to try next. And getting better at research design will help us navigate with more agility.
Like a scientist
Avoid speaking in absolutes
Cultivate your curiosity
"I don't know" - science moves towards certainty, but is never 100% there.
Define your study before you start
"Preregistration". Is it qual or quant? Are you looking for statistical significance or not?
Plan for no conclusion
(unclear results) A null result is a result without the expected content: the proposed effect is absent. Regardless of your study type, sometimes the result will be that you don't know! Even with qual data you'll sometimes have lots of scattered signals but no clear signal through the noise. And just because you can't reject the null hypothesis doesn't mean you've proven it either.
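That last point is easy to demonstrate by simulation: an underpowered test of a real effect usually "finds nothing", which is an unclear result, not a disproof. (This sketch reuses a pooled two-proportion z-test; the rates and sample sizes are made up for illustration.)

```python
# Sketch: failing to reject the null is not the same as proving it.
# We simulate an underpowered A/B test where a genuine lift exists.
import math
import random

def detected(n, rate_a, rate_b, z_crit=1.96):
    """Run one simulated test; True if the difference reads as significant."""
    conv_a = sum(random.random() < rate_a for _ in range(n))
    conv_b = sum(random.random() < rate_b for _ in range(n))
    p_pool = (conv_a + conv_b) / (2 * n)
    if p_pool in (0.0, 1.0):
        return False
    se = math.sqrt(p_pool * (1 - p_pool) * (2 / n))
    return abs(conv_b / n - conv_a / n) / se > z_crit

random.seed(42)
runs = 1000
hits = sum(detected(n=150, rate_a=0.10, rate_b=0.13) for _ in range(runs))
print(hits / runs)  # well under 0.5: most runs miss a real 3-point lift
```

The effect is real in every simulated run, yet the small sample usually can't reject the null - exactly the "no conclusion" outcome worth planning for.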
Start by identifying a good candidate study
Replication studies - doing the same study again - help us be more confident in the results.
Good candidates:
- you expect the effect to be consistent over time
- the hypothesis is important to business model functioning
- the product has been consistent since the last study

Poor candidates:
- you don't expect the effect to be consistent
- low risk to the business model, or low business priority
- UX or UI in flux; too many variables to control for
This can be qual or quant. Does this effect or assumption have to hold for the business model to survive? Most quant tests will be hard to replicate with the same results - though you might learn new things. And a mismatch won't necessarily debunk the earlier results; you just won't be able to match them because of the changes that have happened since.
Consider trying synthesis and interpretation with multiple groups of researchers
- interpretation by the people who didn't collect the data
- needs experienced UXR people
- compare summaries / proposed next steps
- mismatch could be an opportunity to learn
Learn to read references
Paper abstract is a good start
Develop your sniff test
When you're reading or watching the news, actually look for and check the science citation! Learn to read scientific papers. Tell story about +40% GDPR - Adelaide Uni ML brochure - if time allows.
...?
- open data
- open results
- open methods
- share research templates
- FOSS style? Make skill guilds across businesses? Peer-review each other's research? Do meta-studies?
Come talk to me after and let's discuss.
...? Visualising what we know over time: change points, discovery points. A resource for the entire team and a different way of thinking about building a body of knowledge.
Build culture for continuous learning
Replication studies (maybe)
Spend more time on research design & analysis
Uncertain results are ok