Abstract

First, we looked at which katas are performed most often: Anan Dai (11.56%), followed by Suparinpei (11.47%), Papuren (11.42%), Unsu (9.95%) and Anan (8.76%). There is a highly significant relationship between the athlete's category and their choice of kata (Pearson's $\chi^2$ test with 30 degrees of freedom, p-value = 6.84E-279). Athletes in the individual female category prefer Shitoryu katas. In contrast, athletes in the men's team category prefer Shotokan katas. In the women's team and men's individual categories, the most performed katas are spread across Shotokan and Shitoryu.

Then, we studied how athletes choose their katas according to the round, focusing on the two styles practiced by the vast majority of athletes. For Shotokan karatekas, Unsu and, to a lesser extent, Gojushiho Sho are widely used in both the first and the final rounds. In contrast, Gojushiho Dai is used more in the first rounds, while Kanku Sho and Gankaku are used more in the last rounds. For Shitoryu karatekas, Anan, Anan Dai and Papuren play the same role as Unsu and Gojushiho Sho: they are heavily used whatever the round. Chatanyara Kushanku is performed mostly in the later rounds, unlike Suparinpei, which is performed mostly in the earlier rounds.

Finally, we compared the scores obtained by Shotokan and Shitoryu katas. Without taking the categories into account, Shitoryu katas obtain significantly better scores (Welch T-test, p-value = 1.36E-16). Without assuming anything about the distribution of the scores, we obtain the same result (Mann-Whitney U-test, p-value = 1.02E-15). Within each category, the Shitoryu style has an advantage over the Shotokan style, except in the men's team category.

1 Introduction
- 1.1 What are the kata competitions?
- 1.2 Purpose of this notebook
2 Data collecting
3 Load data
4 Dataset description
5 Most popular katas
6 Athletes' strategies
- 6.1 Kata per round: visualization
- 6.2 Kata per round: analysis track
7 Scores analysis
8 Conclusion
9 References
10 Further information

Introduction¶

What are the kata competitions?¶

In martial arts, a kata is a codification of technical sequences. The set of karate katas constitutes a historical heritage. Practitioners use it as a database on which they can work on the essential elements of karate (such as strength, speed, explosiveness, balance, flexibility, concentration, aesthetics, etc.). The goal is to assimilate the techniques of a kata in order to be able to adapt them in different combat situations.

Official karate competitions are divided into two disciplines: Kumite and Kata. The first discipline is a fighting competition between two people, which is common in combat sports. The second is a competition in which, in each round, the participant must perform a kata of his/her choice from the WKF (World Karate Federation) kata list. Depending on their performance, participants either move on to the next round or are eliminated. These competitions can be individual or team competitions.

The following video summarizes the evaluation criteria for kata performance.

Purpose of this notebook¶

In 2019, the rules of kata competitions changed. Previously, kata competitions consisted of one-on-one confrontations. This format has been replaced by a score-based ranking system in which the best-ranked competitors move on to the next round and the others are eliminated.

There are a few studies on katas in high level competition. However, the knowledge on this field is still new and limited.

The objectives of this notebook are as follows:

- to collect as much data as possible on high-level kata competitions since the implementation of the new rules;
- to show which katas are the most used, in particular which katas are used in the first rounds and which in the finals;
- to show which katas are the best rated by the jury;
- to compare the scores obtained by the katas of particular styles of karate;
- to propose visualizations for the above points, for general and category-specific data (men, women, teams, etc.).

Data collecting¶

The data collected is from various WKF senior (i.e. over 18 years old) competitions that have taken place since the implementation of the new kata competition rules:

Karate1 Premiere League Paris 2019
Karate1 Premiere League Dubai 2019
Karate1 Series A Salzburg 2019
PKF Senior Championships 2019
EKF Senior Championships 2019
OKF Senior Championships 2019
SEAKF Senior Championships 2019
Asian University Karate Championships 2019
Karate1 Premiere League Rabat 2019
Balkan Senior Championships 2019
Karate1 Series A Istanbul 2019
Karate1 Premiere League Shanghai 2019
Karate1 Series A Montreal 2019
UFAK Championships 2019
AKF Senior Championships 2019
All African Games 2019
Karate1 Premiere League Tokyo 2019
Small States of Europe Karate Championships 2019
Karate1 Series A Santiago 2019
Karate1 Premiere League Moscow 2019
Karate1 Premiere League Madrid 2019
SEA Games Philippines 2019
Karate1 Series A Santiago 2020
Karate1 Premiere League Paris 2020
UFAK Championships Tangier 2020
Karate1 Premiere League Dubai 2020
Karate1 Premiere League Salzburg 2020
Karate1 Premiere League Istanbul 2021
Karate1 Premiere League Lisbon 2021
EKF Senior Championships 2021
Tokyo Olympic Qualification Tournament 2021

The data was collected from the website sportdata.org. For each performance, we record the name of the kata, the score obtained, the category of the competition, the number of rounds remaining before the final and, if needed, the particularity of the round (for example, a round for a medal or a round to break a tie). These data can be read on the different pages of the website, but the website does not provide a single file gathering all of them. For this reason, such a file is not provided publicly with this notebook (the data migration was done privately). Moreover, no names or personal information were collected.

Load data¶

The archive kata.csv.zip (not provided with this notebook as explained in the previous section) gathers all the collected data in a table in csv format.

Here is a glimpse of a few lines from this table.

kata	score	category	matchs_before_final	special	tournament
Unsu	24.460000	Male	3	No	K1 Premiere League Madrid 2019
Kanku Sho	24.980000	Male	1	No	K1 Series A Santiago 2020
Ohan	24.680000	Male	3	No	K1 Premiere League Madrid 2019
Anan Dai	24.080000	Male	1	No	UFAK Championships 2019
Unsu	20.500000	Male	4	No	K1 Series A Salzburg 2019

This data will be loaded by our program, to allow us to perform all the data analysis and visualization.
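As a minimal sketch of this loading step, the snippet below parses the five sample rows shown above with the standard library. The comma-separated layout and the helper name `load_rows` are assumptions made for illustration, since kata.csv itself is not distributed; with the real file, the same function would be given the opened file instead of the in-memory sample.

```python
import csv
import io

# The five sample rows shown above, written as comma-separated text.
# Assumption: the real kata.csv uses the same column names.
SAMPLE = """kata,score,category,matchs_before_final,special,tournament
Unsu,24.46,Male,3,No,K1 Premiere League Madrid 2019
Kanku Sho,24.98,Male,1,No,K1 Series A Santiago 2020
Ohan,24.68,Male,3,No,K1 Premiere League Madrid 2019
Anan Dai,24.08,Male,1,No,UFAK Championships 2019
Unsu,20.50,Male,4,No,K1 Series A Salzburg 2019
"""


def load_rows(handle):
    """Read performances from a CSV handle and convert the numeric columns."""
    rows = []
    for row in csv.DictReader(handle):
        row["score"] = float(row["score"])
        row["matchs_before_final"] = int(row["matchs_before_final"])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE))
print(len(rows), rows[0]["kata"], rows[0]["score"])
```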

Dataset description¶

Most popular katas¶

There are 102 katas that are allowed in official competitions. However, only some of them are actually performed in high-level competitions. Indeed, the athletes (and their coaches) choose the katas that will best allow them to show off their skills. Thus, the choice of kata is essential. The complexity of the techniques present in the kata, the presence of jumps, the intensity of the rhythm and the overall duration of the kata are all taken into account. Many katas are therefore considered "too simple" and are never performed in high-level competitions.

Pie chart¶

We can see below the katas that are the most popular, those that have been the most performed.
(You can select a specific category with the button on the left)

Table of the number of uses of the katas
kata	Female (Individual)	Male (Individual)	Female (Team)	Male (Team)	Female	Male	Team	Individual	All
Anan Dai	318	336	39	24	357	360	63	654	717
Suparinpei	356	336	8	11	364	347	19	692	711
Papuren	538	122	30	18	568	140	48	660	708
Unsu	124	361	40	92	164	453	132	485	617
Anan	197	232	56	58	253	290	114	429	543
Gojushiho Sho	122	301	34	83	156	384	117	423	540
Chatanyara Kushanku	314	167	9	22	323	189	31	481	512
Gojushiho Dai	119	195	35	17	154	212	52	314	366
Kanku Sho	49	166	24	90	73	256	114	215	329
Gankaku	73	191	0	15	73	206	15	264	279
Chibana No Kushanku	98	42	9	4	107	46	13	140	153
Ohan Dai	25	99	1	3	26	102	4	124	128
Kururunfa	74	35	10	5	84	40	15	109	124
Paiku	18	23	55	10	73	33	65	41	106
Sansai	24	78	0	4	24	82	4	102	106
Ohan	13	38	10	4	23	42	14	51	65
Enpi	13	15	2	7	15	22	9	28	37
Nipaipo	15	12	4	0	19	12	4	27	31
Tomari Bassai	23	0	1	0	24	0	1	23	24
Sochin	9	6	0	2	9	8	2	15	17
Heiku	4	0	12	1	16	1	13	4	17
Gojushiho	5	4	5	1	10	5	6	9	15
Oyadomari No Passai	7	0	0	0	7	0	0	7	7
Kusanku	2	5	0	0	2	5	0	7	7
Shisochin	0	6	0	0	0	6	0	6	6
Kousoukun Dai	3	3	0	0	3	3	0	6	6
Unshu	3	1	2	0	5	1	2	4	6
Pachu	0	4	0	0	0	4	0	4	4
Kyan No Chinto	1	3	0	0	1	3	0	4	4
Bassai Dai	2	1	0	0	2	1	0	3	3
Kousoukun Sho	1	0	1	1	2	1	2	1	3
Chinto	1	1	0	0	1	1	0	2	2
Seisan	0	1	0	0	0	1	0	1	1
Chinte	1	0	0	0	1	0	0	1	1
Seipai	0	1	0	0	0	1	0	1	1
Sanseiru	0	1	0	0	0	1	0	1	1
Matsumura Bassai	0	1	0	0	0	1	0	1	1
Kishimoto No Kushanku	1	0	0	0	1	0	0	1	1
Kanku Dai	1	0	0	0	1	0	0	1	1
Nijushiho	1	0	0	0	1	0	0	1	1

Legend :

- Shotokan style katas

- Shitoryu style katas

- Shitoryu and Shotokan style katas

- Katas of other styles
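The percentages quoted in the abstract can be recovered from the "All" column of this table. The sketch below hard-codes that column (the variable and function names are illustrative) and divides each count by the total number of recorded performances.

```python
# "All" column of the table above: number of performances per kata.
uses = {
    "Anan Dai": 717, "Suparinpei": 711, "Papuren": 708, "Unsu": 617,
    "Anan": 543, "Gojushiho Sho": 540, "Chatanyara Kushanku": 512,
    "Gojushiho Dai": 366, "Kanku Sho": 329, "Gankaku": 279,
    "Chibana No Kushanku": 153, "Ohan Dai": 128, "Kururunfa": 124,
    "Paiku": 106, "Sansai": 106, "Ohan": 65, "Enpi": 37, "Nipaipo": 31,
    "Tomari Bassai": 24, "Sochin": 17, "Heiku": 17, "Gojushiho": 15,
    "Oyadomari No Passai": 7, "Kusanku": 7, "Shisochin": 6,
    "Kousoukun Dai": 6, "Unshu": 6, "Pachu": 4, "Kyan No Chinto": 4,
    "Bassai Dai": 3, "Kousoukun Sho": 3, "Chinto": 2, "Seisan": 1,
    "Chinte": 1, "Seipai": 1, "Sanseiru": 1, "Matsumura Bassai": 1,
    "Kishimoto No Kushanku": 1, "Kanku Dai": 1, "Nijushiho": 1,
}

total = sum(uses.values())


def share(kata):
    """Percentage of all recorded performances that used this kata."""
    return 100 * uses[kata] / total


# Print the five most performed katas with their share of the total.
for kata in sorted(uses, key=uses.get, reverse=True)[:5]:
    print(f"{kata}: {share(kata):.2f}%")
```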

Katas preferences by category¶

We now want to see whether the choice of kata is related to the sport category. To do so, we carry out a statistical test. We only take into account the katas used at least 5 times in each category, so that the conditions of validity of the test are met.
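The selection rule can be sketched as follows, on a few rows copied from the table of uses. The names `counts` and `MIN_USES` are illustrative, not taken from the notebook's code.

```python
# Uses per category, in the order: Female (Individual), Male (Individual),
# Female (Team), Male (Team). Rows copied from the table of uses.
counts = {
    "Kururunfa": (74, 35, 10, 5),
    "Paiku": (18, 23, 55, 10),
    "Gankaku": (73, 191, 0, 15),
    "Chibana No Kushanku": (98, 42, 9, 4),
    "Ohan": (13, 38, 10, 4),
}

MIN_USES = 5  # threshold stated in the text

# A kata is kept only if its smallest per-category count reaches the threshold.
kept = [kata for kata, row in counts.items() if min(row) >= MIN_USES]
print(kept)
```

Gankaku, for instance, is excluded despite its 279 uses overall, because it was never performed in the Female (Team) category.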

Pearson's chi-squared test¶

We will perform a Pearson's $\chi^2$ test.

Details:

Let $X$ and $Y$ be two random variables that represent respectively the kata performed and the sport category of the participant. We want to know if $X$ and $Y$ are independent. The hypotheses of our statistical test are:

$H_0$ : $X$ and $Y$ are independent (null hypothesis).
$H_1$ : $X$ and $Y$ are related (alternative hypothesis).

We study the realizations of the pairs $(X,Y)$ in the total sample (we observe the number of times each kata was used in each category). Let $O_{ij}$ denote the observed number of performances for which $X$ takes the value $i$ and $Y$ the value $j$. Here is our contingency table, consisting of the values $O_{ij}$ observed for every pair $(i,j)$.

Contingency table (Observed values)
	Female (Individual)	Male (Individual)	Female (Team)	Male (Team)
Anan	197	232	56	58
Anan Dai	318	336	39	24
Chatanyara Kushanku	314	167	9	22
Gojushiho Dai	119	195	35	17
Gojushiho Sho	122	301	34	83
Kanku Sho	49	166	24	90
Kururunfa	74	35	10	5
Paiku	18	23	55	10
Papuren	538	122	30	18
Suparinpei	356	336	8	11
Unsu	124	361	40	92

Under the null hypothesis $H_0$, the observed differences in frequencies are only due to sampling fluctuations and not to any link between the kata and the category. We can calculate the expected values under $H_0$. For a given pair $(i,j)$, the expected value is the product of the row total and the column total, divided by the grand total: $E_{ij}=\dfrac{ \left(\sum_{l}{O_{il}}\right) \times \left(\sum_{k}{O_{kj}}\right) }{\sum_{k}\sum_{l}{O_{kl}}}$. Here is the table of expected values under $H_0$.

Expected values under the null hypothesis
	Female (Individual)	Male (Individual)	Female (Team)	Male (Team)
Anan	229.536696	234.170681	35.012327	44.280296
Anan Dai	303.089892	309.208800	46.231747	58.469562
Chatanyara Kushanku	216.432391	220.801821	33.013465	41.752323
Gojushiho Dai	154.715342	157.838801	23.599469	29.846387
Gojushiho Sho	228.268538	232.876920	34.818889	44.035653
Kanku Sho	139.074720	141.882420	21.213730	26.829130
Kururunfa	52.417220	53.475441	7.995449	10.111891
Paiku	44.808269	45.712877	6.834819	8.644036
Papuren	299.285416	305.327518	45.651432	57.735634
Suparinpei	300.553575	306.621278	45.844870	57.980277
Unsu	260.817940	266.083444	39.783804	50.314811

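We compute the distance between the observed values $O_{ij}$ and the values $E_{ij}$ expected under independence: $D = \sum_i \sum_j \dfrac{ ( O_{ij} - E_{ij} )^2 }{ E_{ij} }$. Under $H_0$, $D$ approximately follows a $\chi^2$ distribution with $(11-1)\times(4-1)=30$ degrees of freedom, since the table has 11 katas and 4 categories.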
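The expected values and the distance $D$ can be recomputed from the contingency table with a few lines of standard-library Python. This is a sketch with illustrative variable names. It stops at the statistic and its degrees of freedom, because turning $D$ into a p-value needs the $\chi^2$ distribution function, which the standard library does not provide.

```python
# Observed contingency table. Columns: Female (Individual),
# Male (Individual), Female (Team), Male (Team).
observed = {
    "Anan": [197, 232, 56, 58],
    "Anan Dai": [318, 336, 39, 24],
    "Chatanyara Kushanku": [314, 167, 9, 22],
    "Gojushiho Dai": [119, 195, 35, 17],
    "Gojushiho Sho": [122, 301, 34, 83],
    "Kanku Sho": [49, 166, 24, 90],
    "Kururunfa": [74, 35, 10, 5],
    "Paiku": [18, 23, 55, 10],
    "Papuren": [538, 122, 30, 18],
    "Suparinpei": [356, 336, 8, 11],
    "Unsu": [124, 361, 40, 92],
}

row_totals = {kata: sum(row) for kata, row in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# E_ij = (row total i) * (column total j) / (grand total)
expected = {
    kata: [row_totals[kata] * c / grand_total for c in col_totals]
    for kata in observed
}

# D = sum over all cells of (O - E)^2 / E
statistic = sum(
    (o - e) ** 2 / e
    for kata in observed
    for o, e in zip(observed[kata], expected[kata])
)
dof = (len(observed) - 1) * (len(col_totals) - 1)

print(f"E[Anan, Female (Individual)] = {expected['Anan'][0]:.6f}")
print(f"D = {statistic:.1f} with {dof} degrees of freedom")
```

The statistic can then be compared with the 5% critical value of the $\chi^2(30)$ distribution, which is about 43.77.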

Pearson's chi-squared test results¶

Analysis track¶

In the Pie chart section we can see which katas are the most practiced according to the category.

The differences in the choice of katas between the categories are significant as we showed in the previous section.

Athletes' strategies¶

In a competition, the more participants there are, the more rounds there will be. However, a competitor cannot perform the same kata several times in the same competition. Therefore, the choice of the different katas and their order must be meticulously planned.

Thus, different strategies are possible. In general, competitors perform katas of the style of karate they specialize in (few perform katas of different styles). Some prefer to save what they consider their best katas for the last rounds. Others prefer to use them before, to increase their chances of passing certain rounds. The idea here is to see if a pattern seems to emerge: if some katas are frequently used in the first rounds and if others are rather last rounds katas.

Kata per round: visualization¶

In this section, we will see which katas are the most used according to the round. As the number of rounds varies from one competition to another, we look at the number of rounds left before the final.
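A minimal sketch of this grouping, applied to the five sample rows shown in the Load data section (the variable names are illustrative): performances are counted per kata for each value of the number of rounds left before the final.

```python
from collections import Counter, defaultdict

# (kata, rounds left before the final) for the five sample rows.
performances = [
    ("Unsu", 3),
    ("Kanku Sho", 1),
    ("Ohan", 3),
    ("Anan Dai", 1),
    ("Unsu", 4),
]

# One Counter of katas per number of rounds left.
by_round = defaultdict(Counter)
for kata, rounds_left in performances:
    by_round[rounds_left][kata] += 1

# Earliest rounds first (largest number of rounds left).
for rounds_left in sorted(by_round, reverse=True):
    print(rounds_left, by_round[rounds_left].most_common())
```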

We will focus on the most used katas of the two styles that dominate high-level competition: Shotokan and Shitoryu.

You can select a specific category with the button.

Kata per round: analysis track¶

For Shotokan katas, we can see that Unsu is the kata of Shotokan finalists. It is a "safe bet", used as much in the first rounds as in the last ones. Gojushiho Sho, although less used, is also very popular in both the first and the last rounds. Gojushiho Dai is mostly used in the first rounds (it is used a lot when there are three rounds left before the final). On the other hand, Gankaku and Kanku Sho are used in the last rounds and not much in the first rounds. These observations are similar in each category.

For Shitoryu katas, we can see that Anan, Anan Dai and Papuren are heavily used whatever the round. Chatanyara Kushanku is mostly performed in the last rounds, unlike Suparinpei, which is mostly performed in the first rounds. These observations are similar across the different categories. It is interesting to note that Paiku is used a lot in the women's team category, regardless of the round.

Scores analysis¶

In this section, we are interested in the scores obtained.

Score by category¶

First of all, we will see a box plot that gives the "five-number summary" (median, upper and lower quartiles, minimum and maximum) of the scores obtained by the athletes.

This graph provides an overview. We have not taken into account performances with a score of zero (which correspond to disqualifications). We can see that the median scores are very close. There is no overwhelming difference that is visible between the scores obtained in the different categories.
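The five-number summary behind such a box plot can be sketched with the standard library. The scores below are illustrative only: the five sample rows shown in the Load data section plus one score of zero, which is dropped before summarising as described above. The "inclusive" quartile method is one convention among several, and plotting libraries may use a slightly different one.

```python
import statistics

# Illustrative scores: five sample rows plus one disqualification (0).
scores = [24.46, 24.98, 24.68, 24.08, 20.50, 0.0]

# Scores of zero correspond to disqualifications and are excluded.
valid = sorted(s for s in scores if s > 0)

# Quartiles with the "inclusive" method (the sample includes its extremes).
q1, median, q3 = statistics.quantiles(valid, n=4, method="inclusive")
summary = (valid[0], q1, median, q3, valid[-1])
print(summary)
```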

Score by kata¶

Now we will see which katas have obtained the best scores. It is important to remember that the score obtained depends above all on the performance of the athlete, especially since minor mistakes or small imbalances greatly affect the score obtained in high level competitions. For this reason we will not calculate the average score but rather the median score of each kata, and in each category.
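To illustrate why the median is preferred here, the sketch below groups scores by kata and compares the mean with the median. The scores are hypothetical values chosen for illustration, not taken from the dataset: a single poor performance pulls the mean of the first kata down by about one point, while its median barely moves.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical (kata, score) pairs, for illustration only.
performances = [
    ("Unsu", 24.4), ("Unsu", 24.6), ("Unsu", 24.5), ("Unsu", 20.5),
    ("Anan", 23.8), ("Anan", 24.0),
]

# Group the scores by kata.
scores_by_kata = defaultdict(list)
for kata, score in performances:
    scores_by_kata[kata].append(score)

for kata, scores in scores_by_kata.items():
    print(f"{kata}: mean={mean(scores):.2f}, median={median(scores):.2f}")
```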

First, we represent the median scores in a graph. The outliers (katas that have been performed less than one time out of a hundred) are not taken into account.
(You can select a specific category with the button on the left)

In a second step, all the median scores for each kata and for each category are represented in a table. Here all the katas are taken into account (even the outliers).

Median score per kata
kata	Female (Individual)	Male (Individual)	Female (Team)	Male (Team)	Female	Male	Team	Individual	All
Kishimoto No Kushanku	27.54	nan	nan	nan	27.54	nan	nan	27.54	27.54
Oyadomari No Passai	25.62	nan	nan	nan	25.62	nan	nan	25.62	25.62
Seisan	nan	24.88	nan	nan	nan	24.88	nan	24.88	24.88
Shisochin	nan	24.53	nan	nan	nan	24.53	nan	24.53	24.53
Ohan	23.72	24.47	25.08	25.20	24.54	24.47	25.08	24.46	24.48
Kururunfa	24.20	24.94	24.75	24.20	24.35	24.50	24.48	24.34	24.38
Kyan No Chinto	24.20	24.48	nan	nan	24.20	24.48	nan	24.34	24.34
Ohan Dai	23.86	24.36	21.78	23.20	23.83	24.29	22.49	24.17	24.13
Tomari Bassai	24.18	nan	22.76	nan	23.95	nan	22.76	24.18	23.95
Sochin	22.34	24.33	nan	24.77	22.34	24.33	24.77	23.94	23.94
Chibana No Kushanku	24.50	23.63	23.20	24.37	24.34	23.66	23.66	24.11	23.88
Sansai	23.21	23.90	nan	22.31	23.21	23.84	22.31	23.78	23.75
Suparinpei	23.78	23.66	24.41	23.46	23.79	23.66	23.72	23.75	23.74
Anan	23.34	23.86	24.20	24.68	23.54	24.00	24.21	23.60	23.74
Chatanyara Kushanku	23.86	23.67	22.60	23.70	23.80	23.67	23.52	23.76	23.74
Anan Dai	23.40	23.94	23.48	24.63	23.41	24.00	24.00	23.67	23.68
Kanku Sho	22.80	23.68	23.34	24.14	23.02	23.92	23.98	23.45	23.66
Papuren	23.54	23.58	23.57	24.27	23.54	23.72	23.77	23.54	23.58
Unsu	23.34	23.20	24.27	24.80	23.40	23.58	24.66	23.22	23.54
Gankaku	23.40	23.52	nan	25.52	23.40	23.63	25.52	23.46	23.53
Matsumura Bassai	nan	23.40	nan	nan	nan	23.40	nan	23.40	23.40
Gojushiho Dai	22.88	23.34	23.28	23.46	22.91	23.38	23.34	23.13	23.17
Kousoukun Sho	23.16	nan	24.22	21.12	23.69	21.12	22.67	23.16	23.16
Gojushiho Sho	22.74	23.33	22.99	23.39	22.80	23.34	23.28	23.14	23.16
Paiku	22.58	23.40	23.14	24.27	23.03	23.40	23.24	23.16	23.16
Kousoukun Dai	20.78	24.74	nan	nan	20.78	24.74	nan	23.05	23.05
Unshu	23.78	20.68	22.94	nan	23.74	20.68	22.94	23.03	23.01
Sanseiru	nan	22.94	nan	nan	nan	22.94	nan	22.94	22.94
Nipaipo	22.72	23.62	22.77	nan	22.72	23.62	22.77	22.92	22.92
Kusanku	22.64	23.72	nan	nan	22.64	23.72	nan	22.86	22.86
Heiku	21.20	nan	22.73	20.40	22.46	20.40	22.66	21.20	22.26
Gojushiho	20.02	23.59	22.60	21.46	21.56	23.12	22.17	22.26	22.26
Pachu	nan	21.90	nan	nan	nan	21.90	nan	21.90	21.90
Enpi	21.80	21.52	20.79	22.54	21.65	22.15	22.50	21.80	21.80
Chinto	18.32	24.42	nan	nan	18.32	24.42	nan	21.37	21.37
Seipai	nan	21.28	nan	nan	nan	21.28	nan	21.28	21.28
Nijushiho	21.00	nan	nan	nan	21.00	nan	nan	21.00	21.00
Chinte	20.02	nan	nan	nan	20.02	nan	nan	20.02	20.02
Kanku Dai	19.76	nan	nan	nan	19.76	nan	nan	19.76	19.76
Bassai Dai	18.86	16.46	nan	nan	18.86	16.46	nan	17.84	17.84

Legend :

- Shitoryu style katas

- Shotokan style katas

- Shitoryu & Shotokan style katas

- Other style katas

We can see that the kata Kishimoto No Kushanku has the highest median score. However, the table of the number of uses of the katas (Pie chart section) shows that this kata was performed only once. We can therefore ask whether this score depends above all on the performance of the athlete, or whether presenting a kata that had never been presented before also helped the participant to stand out.

Shotokan and Shitoryu: Unequal scores¶

The kata styles over-represented in high level competition are Shotokan and Shitoryu. We want to know if there is a significant difference in the scores obtained according to the style.

Results:

Differences in variances (F-test)

Differences in average scores (Welch T-test)

Differences by category (F-test,Welch T-test)

Differences in average scores (Mann-Whitney U-test)

Differences by category (Mann-Whitney U-test)

F-test of equality of variances¶

Details :
Let $U$ and $V$ be two random variables that represent respectively the score obtained after a performance of a Shotokan kata and of a Shitoryu kata. We observe $U_1, ... , U_{n_1}$, which are $n_1$ random realizations of $U$, and $V_1, ... , V_{n_2}$, which are $n_2$ random realizations of $V$. The two samples are independent. We want to know if their respective variances $\sigma_1^2$ and $\sigma_2^2$ are equal.

$H_0$ : $\sigma_1^2=\sigma_2^2$ (null hypothesis).
$H_1$ : $\sigma_1^2 \neq \sigma_2^2$ (alternative hypothesis).

Let $\mu_1$ and $\mu_2$ be the respective overall means of $U$ and $V$. Since $n_1$ and $n_2$ are large enough, we approximate the distributions of the means $\displaystyle M_1=\dfrac{1}{n_1}\sum_{i=1}^{n_1} U_i$ and $\displaystyle M_2=\dfrac{1}{n_2}\sum_{i=1}^{n_2} V_i$ by the Gaussian distributions $N\left(\mu_1, \dfrac{\sigma_1^2}{n_1}\right)$ and $N\left(\mu_2,\dfrac{\sigma_2^2}{n_2}\right)$ respectively. We denote by $\displaystyle S_1^2=\dfrac{1}{n_1-1}\sum_{i=1}^{n_1}{(U_i - M_1)^2}$ the unbiased estimator of the variance of $U$ and by $\displaystyle S_2^2=\dfrac{1}{n_2-1}\sum_{i=1}^{n_2}{(V_i - M_2)^2}$ the unbiased estimator of the variance of $V$. According to Student's theorem, $K_1=\dfrac{n_1-1}{\sigma_1^2}S_1^2$ follows a $\chi^2(n_1-1)$ distribution and $K_2=\dfrac{n_2-1}{\sigma_2^2}S_2^2$ follows a $\chi^2(n_2-1)$ distribution. Therefore, $F=\dfrac{K_1 \big/ (n_1-1) }{K_2 \big/ (n_2-1) }=\dfrac{S_1^2 \big/ \sigma_1^2 }{S_2^2 \big/ \sigma_2^2 }$ follows a Fisher distribution $F(n_1-1,n_2-1)$. Under $H_0$, $F=\dfrac{S_1^2}{S_2^2}$. We compute the observed variances $s_1^2$ and $s_2^2$ on the respective samples $U_1,...,U_{n_1}$ and $V_1,...,V_{n_2}$.

We perform an F-test of equality of variances in order to know whether the variances are equal or not. Our test statistic is $f= \dfrac{s_1^2}{s_2^2}$.

T-test of comparison of means¶

We compute $m_1$ and $m_2$ the average values of the respective realizations of $U$ and $V$.

We want to know if there is a significant difference between the average Shotokan and Shitoryu scores.

$H_0$ : $\mu_1=\mu_2$ (null hypothesis).
$H_1$ : $\mu_1 \neq \mu_2$ (alternative hypothesis).

As $\sigma_1^2$ and $\sigma_2^2$ are significantly different, we perform a Welch T-test, which tests the equality of two means from two samples with unequal variances. Under $H_0$, $T=\dfrac{M_1-M_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$ approximately follows a Student distribution with $\nu$ degrees of freedom, where $\nu$ is given by the Welch-Satterthwaite equation: $\nu = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1-1} + \dfrac{(s_2^2/n_2)^2}{n_2-1}}$. Our test statistic is $t= \dfrac{m_1-m_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$.

Style scores according to the category¶

We perform the same tests as before, this time for each category.

Female (Individual):

$n_1=521$, $n_2=2004$, $m_1=22.778$, $m_2=23.642$, $s_1^2=2.547$, $s_2^2=2.758$

Variances seem to be equal (p-value: 0.869).
At the 95% confidence level (p-value = 5.03E-26), there is a significant difference between scores: participants get higher scores with Shitoryu katas.

Male (Individual):

$n_1=1265$, $n_2=1323$, $m_1=23.334$, $m_2=23.665$, $s_1^2=3.036$, $s_2^2=2.771$

Variances seem to be equal (p-value: 0.05).
At the 95% confidence level (p-value = 8.63E-07), there is a significant difference between scores: participants get higher scores with Shitoryu katas.

Female (Team):

$n_1=135$, $n_2=174$, $m_1=23.284$, $m_2=23.664$, $s_1^2=2.032$, $s_2^2=1.713$

Variances seem to be equal (p-value: 0.146).
At the 95% confidence level (p-value = 0.017), there is a significant difference between scores: participants get higher scores with Shitoryu katas.

Male (Team):

$n_1=308$, $n_2=144$, $m_1=24.047$, $m_2=24.102$, $s_1^2=1.958$, $s_2^2=2.081$

Variances seem to be equal (p-value: 0.671).
We do not reject H0 (no significant difference between Shotokan and Shitoryu average scores, p-value = 0.702).
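The test statistics for each category can be recomputed from the summary values listed above. The sketch below is written under stated assumptions: it uses only the standard library, the function names are illustrative, and the two-sided p-value comes from a normal approximation rather than the Student distribution. That approximation is reasonable for a moderate $|t|$ with large $\nu$ (as in the Male (Team) category), but for very large $|t|$ it gives p-values far smaller than the Student-based ones reported in the text.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the text: (n1, n2, m1, m2, s1^2, s2^2),
# where index 1 is Shotokan and index 2 is Shitoryu.
summary = {
    "Female (Individual)": (521, 2004, 22.778, 23.642, 2.547, 2.758),
    "Male (Individual)": (1265, 1323, 23.334, 23.665, 3.036, 2.771),
    "Female (Team)": (135, 174, 23.284, 23.664, 2.032, 1.713),
    "Male (Team)": (308, 144, 24.047, 24.102, 1.958, 2.081),
}


def welch(n1, n2, m1, m2, v1, v2):
    """Return Welch's t statistic and the Welch-Satterthwaite nu."""
    a, b = v1 / n1, v2 / n2
    t = (m1 - m2) / sqrt(a + b)
    nu = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, nu


def approx_p(t):
    """Two-sided p-value, normal approximation (assumes large nu)."""
    return 2 * NormalDist().cdf(-abs(t))


for name, stats in summary.items():
    f_ratio = stats[4] / stats[5]  # F-test statistic s1^2 / s2^2
    t, nu = welch(*stats)
    print(f"{name}: f={f_ratio:.3f}, t={t:.2f}, nu={nu:.0f}, "
          f"approx. p={approx_p(t):.3g}")
```

A negative $t$ means that the Shotokan mean is below the Shitoryu mean.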

Non-normality of the data¶

For the previous tests, we assumed that the distributions of $M_1$ and $M_2$ were Gaussian because $n_1$ and $n_2$ were large enough. This approximation is relevant according to the Central Limit Theorem. However, it may be interesting to test if the distribution of the data is indeed Gaussian, and, if not, to perform a non-parametric test (of course, less powerful) without assuming anything about the distribution of the data.

Visual check of normality¶

The distributions appear visually Gaussian on the histogram. However, there are small deviations on the Q-Q plots at the extreme values.

Mathematical test of normality¶

We will now use a statistical test of normality: the Shapiro-Wilk test. Test 1 : We want to know if $U$ is Gaussian.

$H_0$ : $U$ is gaussian (null hypothesis).
$H_1$ : $U$ is not gaussian (alternative hypothesis).

Test 2 : We want to know if $V$ is Gaussian.

$H_0$ : $V$ is gaussian (null hypothesis).
$H_1$ : $V$ is not gaussian (alternative hypothesis).

Shapiro-Wilk test results:¶

Mann-Whitney U-test¶

From now on, we do not assume anything about the distributions of the Shotokan and Shitoryu scores. We perform the non-parametric Mann-Whitney U-test. Our statistic is $J = \sum_{i=1}^{n_1}{\sum_{j=1}^{n_2}{ g(U_i,V_j) }}$ with $g(X,Y)=1$ if $Y < X$, $g(X,Y)=1/2$ if $Y = X$ and $g(X,Y)=0$ if $Y > X$.
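The statistic $J$ counts, over all (Shotokan, Shitoryu) pairs of scores, how often the Shotokan score is the higher one, with ties counted as one half. A direct sketch of this definition follows. The two short samples are hypothetical and the function names are illustrative. The double loop costs $n_1 \times n_2$ comparisons, which is acceptable for a sketch, whereas statistical libraries usually compute the same quantity from ranks.

```python
def g(x, y):
    """Score one pair: 1 if y < x, 1/2 if y == x, 0 if y > x."""
    if y < x:
        return 1.0
    if y == x:
        return 0.5
    return 0.0


def mann_whitney_j(u, v):
    """Sum g over every pair (u_i, v_j)."""
    return sum(g(x, y) for x in u for y in v)


u = [24.0, 23.5, 22.0]  # hypothetical Shotokan scores
v = [23.5, 25.0]        # hypothetical Shitoryu scores

j = mann_whitney_j(u, v)
# The statistic computed with the roles swapped is n1 * n2 - J.
print(j, mann_whitney_j(v, u), len(u) * len(v))
```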

Mann-Whitney U-test results¶

Mann-Whitney U-test by category¶

We perform the same test for each category.

Female (Individual):

$n_1=521$, $n_2=2004$, $m_1=22.778$, $m_2=23.642$, $s_1^2=2.547$, $s_2^2=2.758$

With a confidence level of 95% (p-value = 1.85E-23), Shitoryu katas get higher scores.

Male (Individual):

$n_1=1265$, $n_2=1323$, $m_1=23.334$, $m_2=23.665$, $s_1^2=3.036$, $s_2^2=2.771$

With a confidence level of 95% (p-value = 3.32E-07), Shitoryu katas get higher scores.

Female (Team):

$n_1=135$, $n_2=174$, $m_1=23.284$, $m_2=23.664$, $s_1^2=2.032$, $s_2^2=1.713$

With a confidence level of 95% (p-value = 0.006), Shitoryu katas get higher scores.

Male (Team):

$n_1=308$, $n_2=144$, $m_1=24.047$, $m_2=24.102$, $s_1^2=1.958$, $s_2^2=2.081$

We do not reject H0 (no significant difference between Shotokan and Shitoryu scores, p-value = 0.29).

We observe that the results are the same as in the case where we assumed that the distributions are Gaussian: Shitoryu katas get significantly higher scores in every category except Male Team. The global conclusion remains the same: there is a significant advantage of the Shitoryu style over the Shotokan style (except in the Male Team category).

Conclusion¶

The data analysis can give us some strategic ideas for high-level kata competitions. We can see from the visualizations that very rarely used katas (such as Kishimoto No Kushanku or Ohan) can get much higher scores than frequently performed katas. Rare katas may therefore attract the attention of the judges, although, as noted above, such scores rest on very few performances. Moreover, as we have shown that there is a significant relationship between the choice of kata and the sport category, this information can be used to establish a kata selection strategy and an adapted training plan.

We can also raise the question of the imbalance between styles. We could see that, in all categories (except the Male Team category), the Shitoryu style had an advantage over the Shotokan style. We could ask whether this disadvantage is linked to strategic mistakes in the choice of kata: for example, in the Individual Male category, Kanku Sho is the Shotokan kata that obtains the best marks, even though it is performed much less often than Unsu or Gojushiho Sho. One could also wonder whether the cause is rather a shortage of suitable Shotokan katas. In this case, it might be wise for Shotokan specialists to open up to other styles in order to increase their range of kata choices.

However, this analysis is limited to high-level competitions. The data are not representative of local and regional competitions which correspond, in terms of number, to the vast majority of karate kata competitions.

To conclude, thanks to the interactive tables and graphs in this notebook, readers will be able to read the data in detail and do their own analysis.

References¶

Rosenbaum, Michael. Kata and the Transmission of Knowledge in Traditional Martial Arts. YMAA Publication Center, Boston, 2004.
Statutes & Rules. (n.d.). World Karate Federation. Retrieved July 20, 2021, from https://www.wkf.net/structure-statutes-rules
Karate at Olympic Games Tokyo 2020: KATA RULES. (2021, August 2). [Video]. YouTube. https://www.youtube.com/watch?time_continue=5&v=CAlYGS_KX0o&feature=emb_title
Novosad A, Argajova J, Augustovicova D. New kata evaluation in top-level karate: analysis of frequency and score of katas in K1 Premiere League. Arch Budo 2020; 16: 153-160
WFK - EVENTS ARCHIVE. (n.d.). Sportdata Event Technology. Retrieved July 20, 2021, from https://www.sportdata.org/wkf/set-online/calendar_archiv_main.php
Cerezo, S. C. H. (2010). Estudio Técnico comparado de los Katas de Karate (2nd rev. ed.). Editorial Alas.

Further information¶

Novosad A, Argajova J, Augustovicova D. New kata evaluation in top-level karate: analysis of frequency and score of katas in K1 Premiere League. Arch Budo 2020; 16: 153-160
Doria, C., Veicsteinas, A., Limonta, E., Maggioni, M. A., Aschieri, P., Eusebi, F., Fanò, G., & Pietrangelo, T. (2009). Energetics of karate (kata and kumite techniques) in top-level athletes. European Journal of Applied Physiology, 107(5), 603‑610.
Estimating the Level of Motor Memory Using Suggested E-Measurement and Its Relation with Decisions of Kata Judges in Karate. (2012). Journal of Applied Sports Science, 2(1), 29‑37.
The Effect of Musical Rhythm Use on Motor Coordination and Performance Score of Team Kata for Karate Performers. Assiut Journal of Sport Science and Arts, 113(1), 222‑235.
Layton, C. (1993). Blocking and Countering in Traditional Shotokan Karate Kata. Perceptual and Motor Skills, 76(2), 641‑642.
Lisowska, A., Ogurkowska, M. B., & Gabryelski, J. (2017). Analysis of the occurrence of musculoskeletal pain in Shotokan karate kata athletes. Journal of Combat Sports and Martial Arts, 2(8), 77‑82.
Samir Yousef Abdelaziz, A. (2018). The effect of a mental training program on enhancing some mental skills, kinematic variables and Kata performance level for Karate juniors. Assiut Journal of Sport Science and Arts, 2018(2), 1‑33.
