Auto machine learning tools to distinguish between two killer whale ecotypes

IF 2 | Region 3 (Biology) | Q2 MARINE & FRESHWATER BIOLOGY | Marine Mammal Science | Pub Date: 2024-08-29 | DOI: 10.1111/mms.13175

Mohamed E. Ismail, Ivan D. Fedutin, Erich Hoyt, Tatiana V. Ivkovich, Olga A. Filatova

Abstract

The killer whale, although considered a single species, comprises various ecotypes (genetically and ecologically distinct populations) that each specialize in a specific type of prey (Ford et al., 1998, 2000; Pitman et al., 2011; Pitman & Ensor, 2003; Saulitis et al., 2000). In the northwestern Pacific, killer whales comprise two ecotypes: residents, or R-type (fish-eaters), and transients, also called Bigg's killer whales, or T-type (mammal-eaters) (Filatova et al., 2018, 2019; Ismail et al., 2023). These ecotypes are frequently found in the same areas, but they do not interact socially and are reproductively isolated (Filatova, Borisova, et al., 2015; Foote et al., 2011; Morin et al., 2010). This isolation is linked to significant differences in their morphology (Baird & Stacey, 1988; Kotik et al., 2023), ecology (Bigg, 1987), behavior (Morton, 1990), acoustic communication (Deecke et al., 2005; Filatova, Fedutin, et al., 2015; Foote & Nystuen, 2008), social structure (Baird & Dill, 1996), diet (Borisova et al., 2020; Filatova et al., 2023; Herman et al., 2005), and other aspects. The genetic distinction between the ecotypes has been described for both the eastern and western North Pacific (Filatova, Borisova, et al., 2015; Hoelzel et al., 2007; Morin et al., 2010; Parsons et al., 2013), but the morphological variation has been studied mostly in the eastern North Pacific (Baird & Stacey, 1988; Emmons et al., 2019; Kotik et al., 2023; Perrin et al., 2009). Based on these differences, a recent paper proposed recognizing them as different species (Morin et al., 2024).

Despite these differences, Russian fisheries institutes have refused to recognize the existence of two separate ecotypes and the need to assess them separately. For example, Boltnev (2017) claimed that ecotypes are an artifact of research methods or even a figment of the imagination of the scientists who described this phenomenon. For this reason, VNIRO (Russian Federal Research Institute of Fisheries and Oceanography) still estimates the abundance of both killer whale ecotypes as a single population. This is partly because morphological differences between ecotypes are not immediately obvious to a non-specialist observing whales at sea. Unfortunately, to date, there are no automated techniques capable of easily identifying these two ecotypes in photos without the time-consuming process of digitizing fin contours.

Machine learning (ML), a subfield of artificial intelligence, and especially the convolutional neural network (CNN), is often the preferred approach for image processing applications. ML has proven successful in various tasks (Krizhevsky et al., 2017), such as image classification (He et al., 2016), image segmentation (Long et al., 2015), and object recognition (Redmon et al., 2016). Deep learning neural networks have been used as a photo-identification tool for various species of marine mammals, including right whales (Eubalaena spp.; Bogucki et al., 2019), humpback whales (Megaptera novaeangliae; Cheeseman et al., 2023; Wang et al., 2020), common dolphins (Delphinus delphis; Bouma et al., 2018), blue whales (Balaenoptera musculus; Ramos-Arredondo et al., 2020), killer whales (Orcinus orca; Bergler et al., 2021), and common bottlenose dolphins (Tursiops truncatus; Thompson et al., 2019). However, the most common application has been individual recognition rather than ecotype identification.

The distinctive features of resident (R-type) and transient (T-type) killer whales are the shapes of the dorsal fin and saddle patch: transient killer whales have wider and more triangular dorsal fins and large closed saddle patches, while residents have more rounded fin tips and highly variable saddle patch shapes (Ford et al., 2000). Emmons et al. (2019) aimed to discriminate between eastern North Pacific ecotypes using elliptical Fourier analysis. However, this algorithm requires time-consuming image preprocessing, which makes the method impractical. Machine learning algorithms, by contrast, show considerable promise for identification purposes. Yet creating an efficient machine learning model using traditional methods has proven to be a formidable challenge because of the complexity of the algorithms and of the architecture of deep learning convolutional neural networks (CNNs; Rawat & Wang, 2017). The development of auto machine learning (AutoML) technology has lowered this barrier. AutoML simplifies model creation and improves accuracy (Borkowski et al., 2019). Our study aims to use AutoML technology to differentiate between the western North Pacific killer whale ecotypes using raster images obtained through field surveys. Doing so highlights the presence of two different ecotypes of killer whales in the western North Pacific Ocean. Using AutoML to detect these differences provides objective validation, helping to ensure that the observed differences are real and not subjective.

Data were obtained in the northwestern Pacific Ocean from 2000 to 2022 using different types of surveys. These ranged from vessel-based cetacean surveys conducted along the coast of Eastern Kamchatka, the Kuril Islands, Sakhalin Island, the Commander Islands, Chukotka, and in the Okhotsk Sea, to camp-based observations with daily small boat surveys performed during the summer months in the Avacha Gulf, and off the Commander Islands and Chukotka (Figure 1).

The photographs were taken of known groups that were part of a long-term study by the Far East Russia Orca Project (FEROP), with the ecotype of these groups determined through morphology, behavior, and observations of hunting of specific prey, and confirmed by genetic analysis (Filatova, Borisova, et al., 2015). For more details on ecotype identification, see Filatova et al. (2019). Photographs were graded for the analysis by quality, evaluated through clarity, angle, and the whale's distance from the camera. Photographs had to be in focus, well lit without shadows, and taken with the whale parallel to the camera's plane. Any photographs containing obstructions, such as reflections, distracting backgrounds, or water splashes that could conceal the area of interest, were excluded. Although all photographs contained both the dorsal fin and saddle patch, in many photographs only a part of the saddle patch was visible. Photographs with more than one animal in the same frame and photographs of the same animal taken less than one second apart were excluded. After the selection process, photographs were cropped to center the subject using ACDSee Photo Studio Ultimate and edited in Photoshop CC to adjust brightness and contrast before being converted to grayscale to minimize color distractions. The analysis focused exclusively on photographs of adult females and “others,” explicitly excluding adult males, calves, and juveniles. The term “others” refers to individuals without the distinctive elongated dorsal fin of adult or subadult males. We did not analyze male photographs because the number of good-quality images of T-type males was too small for inclusion in the analysis. In total, 1,084 images (542 for each ecotype) were used for the analysis. The R-type photographs represented 250 individuals, whereas the T-type photographs represented 197 individuals.
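
The two mechanical exclusion rules in the selection step (multi-animal frames and frames of the same animal taken less than one second apart) can be sketched in code. This is only an illustration, not the authors' procedure: the `Photo` record, its field names, and the choice to keep the first frame of a burst and drop the later ones are all assumptions, since the text does not say how bursts were resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    filename: str
    animal_id: str    # hypothetical catalog ID of the photographed whale
    timestamp: float  # capture time in seconds
    n_animals: int    # number of animals visible in the frame


def select_photos(photos, min_gap_s=1.0):
    """Drop multi-animal frames and near-duplicate burst frames.

    Assumption: within a burst, the first frame is kept and any later
    frame of the same animal closer than min_gap_s to it is dropped.
    """
    kept = []
    last_kept_time = {}  # animal_id -> timestamp of the last kept photo
    for p in sorted(photos, key=lambda p: p.timestamp):
        if p.n_animals != 1:
            continue  # more than one animal in the frame
        prev = last_kept_time.get(p.animal_id)
        if prev is not None and p.timestamp - prev < min_gap_s:
            continue  # same animal, less than min_gap_s after a kept frame
        kept.append(p)
        last_kept_time[p.animal_id] = p.timestamp
    return kept


photos = [
    Photo("a.jpg", "R1", 0.0, 1),
    Photo("b.jpg", "R1", 0.4, 1),  # burst duplicate of a.jpg
    Photo("c.jpg", "R1", 1.5, 1),
    Photo("d.jpg", "T7", 0.5, 2),  # two animals in frame
    Photo("e.jpg", "T7", 0.6, 1),
]
selected = [p.filename for p in select_photos(photos)]
print(selected)
```

The quality criteria themselves (focus, lighting, angle, obstructions) were judged by eye in the study and are not modeled here.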

Almost the same procedures were implemented on the Google Cloud AutoML platform, with some variations unique to the platform. Google Cloud AutoML is offered through a service called Vertex AI, which lets users train and deploy ML models. Vertex AI is a paid online service, but the first 3 months are free with a US$300 credit. The default data set split settings were accepted: 80% of the photographs from each ecotype were used for training, 10% for validation, and the remaining 10% for testing the model. The model optimization option favoring higher accuracy was selected, along with the default training budget of 8 node hours.
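
To make the 80/10/10 split concrete, here is a minimal sketch of a per-ecotype random split. The platform performs this step internally, so the function name, the seed, and the rounding rule are assumptions for illustration only; the file names are placeholders.

```python
import random


def split_by_class(photos_by_class, train=0.8, val=0.1, seed=0):
    """Randomly split each class into train/validation/test subsets.

    Assumption: fractions are applied per class and rounded to the
    nearest whole photo; whatever remains goes to the test set.
    """
    rng = random.Random(seed)
    splits = {"train": [], "validation": [], "test": []}
    for label, photos in photos_by_class.items():
        shuffled = list(photos)
        rng.shuffle(shuffled)
        n_train = round(len(shuffled) * train)
        n_val = round(len(shuffled) * val)
        splits["train"] += [(p, label) for p in shuffled[:n_train]]
        splits["validation"] += [(p, label) for p in shuffled[n_train:n_train + n_val]]
        splits["test"] += [(p, label) for p in shuffled[n_train + n_val:]]
    return splits


# Placeholder file names: 542 photographs per ecotype, as in the study
data = {
    "R-type": [f"r_{i:03d}.jpg" for i in range(542)],
    "T-type": [f"t_{i:03d}.jpg" for i in range(542)],
}
splits = split_by_class(data)
sizes = {name: len(items) for name, items in splits.items()}
print(sizes)
```

With 542 photographs per ecotype this rounding gives 434 training, 54 validation, and 54 test photographs per class; the platform's own rounding may differ slightly.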

Additionally, to ensure that the performance of our model was not an artifact and that its ability to differentiate between the dorsal fins of R-type and T-type whales was really based on morphological differences between them, we trained another model using randomized groups. Photographs from both ecotypes were mixed and then randomly divided into two separate groups (group 1 and group 2), ensuring that each group contained an equal number of photographs from both ecotypes. These randomized groups were then uploaded to both platforms and a model was trained on them, using the same photographs and the same group sizes.
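
The randomized-group control can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the function name and seed are hypothetical, and each ecotype is simply shuffled and halved so that both groups receive equal numbers of R-type and T-type photographs.

```python
import random


def randomized_groups(r_photos, t_photos, seed=0):
    """Build two groups that each contain half of every ecotype.

    Because group membership is unrelated to ecotype, a classifier
    trained on these labels should perform no better than chance.
    """
    rng = random.Random(seed)
    groups = {"group 1": [], "group 2": []}
    for photos in (r_photos, t_photos):
        shuffled = list(photos)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        groups["group 1"] += shuffled[:half]
        groups["group 2"] += shuffled[half:]
    return groups


# Placeholder file names standing in for the 542 photographs per ecotype
r_photos = [f"r_{i:03d}.jpg" for i in range(542)]
t_photos = [f"t_{i:03d}.jpg" for i in range(542)]
groups = randomized_groups(r_photos, t_photos)
r_in_group1 = sum(1 for p in groups["group 1"] if p.startswith("r_"))
print(len(groups["group 1"]), len(groups["group 2"]), r_in_group1)
```

Each group ends up with 542 photographs (271 of each ecotype), so the class sizes match those of the real R-type versus T-type comparison.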

The model in Edge Impulse was trained using 80% (434) of the photographs for each ecotype, while the remaining 20% (108) were set aside for testing. After the training stage, the model achieved an accuracy of 90.8% (Table 1a, Figure 2a). Upon testing, it correctly identified 91.7% of the R-type photographs and 94.4% of the T-type photographs, for an overall test accuracy of 93.06%. The model was unable to identify 5.6% of the R-type photographs and 2.8% of the T-type photographs, which it classified as uncertain. This could be attributed to several factors, such as photographs containing features that are not distinctly characteristic of a single class but are instead shared between classes, or simply to the quality of the photographs (Table 1a, Figure 2b). With the randomized groups, the model had difficulty distinguishing between the two groups: it achieved an accuracy of 51.1% during the training stage and 7.48% during model testing (Table 1b, Figure 2c,d).
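
The reported Edge Impulse test percentages are mutually consistent, as a short calculation shows. The photograph counts below are not given in the text; they are back-calculated from the reported percentages on the assumption of 108 test photographs per ecotype, and should be read as an inference rather than reported data.

```python
TEST_PER_ECOTYPE = 108  # 20% of 542, as reported

# Inferred counts (assumption): chosen so that they reproduce the
# reported percentages; the text itself reports percentages only.
inferred_counts = {
    "R-type": {"correct": 99, "uncertain": 6},
    "T-type": {"correct": 102, "uncertain": 3},
}


def pct(count, total):
    """Percentage rounded to two decimals."""
    return round(100 * count / total, 2)


def summarize(counts, n_per_class):
    """Per-class and overall percentages from per-class counts."""
    summary = {
        label: {key: pct(value, n_per_class) for key, value in c.items()}
        for label, c in counts.items()
    }
    total_correct = sum(c["correct"] for c in counts.values())
    summary["overall"] = pct(total_correct, n_per_class * len(counts))
    return summary


summary = summarize(inferred_counts, TEST_PER_ECOTYPE)
print(summary)
```

Under these inferred counts, 99/108 and 102/108 round to the reported 91.7% and 94.4%, 6/108 and 3/108 round to 5.6% and 2.8%, and (99 + 102)/216 gives the reported overall accuracy of 93.06%.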

On Google Cloud Platform, 10% of the photographs were used as a validation set to tune the model during training, ensuring its readiness for the testing step. Once training was complete, the model was run on the test set (10%) to provide the final evaluation metrics. The model achieved an average precision of 98.17%, measured as the area under the precision-recall curve (AuPRC; Figure 3a). Precision is the proportion of photographs assigned to a class (R-type or T-type) that truly belong to that class: high precision means that a photo classified as R-type (or T-type) is indeed R-type (or T-type). Recall is the proportion of photographs of a given class that the model captures without missing them. The AuPRC indicates how well the model balances precision and recall across different thresholds. A high AuPRC means that the model is effectively and precisely classifying killer whale photographs into R-type and T-type (Figure 3a,b). The confusion matrix in Figure 4 shows where misclassifications occur and how frequently the model predicts the correct class. Moreover, the randomized groups in the Google Cloud AutoML platform revealed that the model was unable to distinguish between the two groups (Figures 3c,d, 4b). These outcomes confirm that the performance of our model is not an artifact.
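
The relationship between precision, recall, and the area under the precision-recall curve can be illustrated with a small sketch. How the platform computes its AuPRC is not described here, so the step-wise average precision below (mean of the precision values at each correctly ranked positive) is an assumed, commonly used estimator, and the labels and scores are toy values, not study data.

```python
def precision_recall(labels, scores, threshold):
    """Precision and recall for the positive class at one threshold.

    labels: 1 for the class of interest (e.g., R-type), 0 otherwise.
    scores: model confidence that each photo belongs to that class.
    """
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    Assumes distinct scores; ties are broken by sort order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos if n_pos else 0.0


# Toy example: five photos, three of them truly in the positive class
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
p, r = precision_recall(labels, scores, threshold=0.65)
ap = average_precision(labels, scores)
print(p, r, ap)
```

Raising the threshold generally trades recall for precision; the average precision summarizes that trade-off over all thresholds in a single number, which is why a value near 1 indicates that the positives are ranked almost entirely above the negatives.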

The accuracy of the models on both platforms, Edge Impulse and Google Cloud AutoML Vertex AI, was high, indicating that they performed well. Despite the relatively small data set (1,084 photographs: 542 for each ecotype), which did not include adult males, calves, or juveniles, the trained models were able to identify and learn the patterns and features that differentiate R-type photographs from T-type photographs, enabling us to classify them. This study builds on the findings (Wäldchen & Mäder, 2018; Weinstein, 2018) that machine learning provides a powerful alternative for image classification to differentiate ecotypes, species, and even subspecies. By contrast, a study that employed elliptical Fourier analysis to differentiate between the killer whale ecotypes achieved only 70% accuracy for dorsal fin contours and 58% accuracy for saddle patch contours (Emmons et al., 2019). This shows the challenges faced by non-machine-learning approaches in achieving high accuracy when differentiating morphological features. Various machine learning models were able to classify vector mosquito species, even though there was significant interspecies similarity and intraspecies variation (Park et al., 2020). In that work, deep learning models achieved high classification accuracy by using morphological characteristics similar to those used by human experts in the classification process. Various studies have employed the same approach to distinguish between different types of organisms, including birds, insects, fish, plants, and even invertebrates (see Table 2). Using artificial intelligence (AI) to detect these differences serves as an impartial validation that helps ensure that the differences observed by researchers are not just assumptions or subjective interpretations, and shows that AI has the potential to achieve similar or higher accuracy.

Both Edge Impulse and Google Cloud AutoML effectively supported the study and presented the results adequately, with minor differences that did not affect the overall outcome. However, Google Cloud AutoML and other similar platforms, such as Amazon Web Services (AWS) and Microsoft Azure, are free only within certain limits, and exceeding these limits leads to charges, while the Edge Impulse platform is completely free, without any limits.

Many authors have suggested that R-type and T-type killer whales need to be considered as different species or subspecies (Baird & Stacey, 1988; Morin et al., 2010, 2024; Reeves et al., 2004). The two ecotypes are socially and genetically isolated (Filatova, Borisova, et al., 2015; Hoelzel et al., 2002, 2007; Miller et al., 2010; Morin et al., 2024; Riesch et al., 2012). The present study supports the differentiation of ecotypes, confirming stable morphological differences between them.

Our results have high practical value for killer whale management in the western North Pacific because they emphasize the existence of two separate ecotypes in the region, which has been denied by Russian fisheries institutes. One of the arguments for this denial has been the lack of reliable morphological differences that would allow the ecotypes to be distinguished visually without observing their behavior or performing genetic analysis of biopsy samples. Our work clearly demonstrates that these differences exist and can be used by machine learning tools, meaning that they are real, objective, and reliable. Our current study used only high-quality photographs because of the limitations of machine learning models in handling images of varying quality, which can affect the accuracy of classification and pattern recognition. Future studies will aim to include a wider variety of photo qualities and photographs of adult males to develop a more robust and generalizable model. This will enable inexperienced observers, such as fisheries inspectors or coastguard officers, to use pretrained neural networks to identify the ecotype of killer whales.

To conclude, this study introduces novel machine learning techniques that are quick and affordable for classifying killer whale ecotypes based on morphometric variables. Additionally, it demonstrates that artificial intelligence can easily distinguish between the Russian R-type and T-type killer whales, despite previous claims that they are indistinguishable (Boltnev, 2017). Presenting findings like this aims to reduce the gap between the fields of conservation science and machine learning, while also inspiring the adoption of similar approaches to the study of other species.

Mohamed Elsayed Ismail: Formal analysis; investigation; resources; visualization; writing – original draft; writing – review and editing. Ivan D. Fedutin: Resources. Erich Hoyt: Funding acquisition; writing – original draft; writing – review and editing. Tatiana V. Ivkovich: Resources. Olga A. Filatova: Conceptualization; funding acquisition; supervision; writing – original draft; writing – review and editing.


Source journal: Marine Mammal Science (Biology, Zoology)
CiteScore: 4.80
Self-citation rate: 8.70%
Review time: 6-12 weeks

About the journal: Published for the Society for Marine Mammalogy, Marine Mammal Science is a source of significant new findings on marine mammals resulting from original research on their form and function, evolution, systematics, physiology, biochemistry, behavior, population biology, life history, genetics, ecology and conservation. The journal features both original and review articles, notes, opinions and letters. It serves as a vital resource for anyone studying marine mammals.