THE APPLIED DATA SCIENCE WORKSHOP: Prostate Cancer Classification and Recognition Using Machine Learning and Deep Learning with Python GUI

THE APPLIED DATA SCIENCE WORKSHOP: Prostate Cancer Classification and Recognition Using Machine Learning and Deep Learning with Python GUI

Author: Vivian Siahaan

Publisher: BALIGE PUBLISHING

Published: 2023-07-19

Total Pages: 357

ISBN-13:

DOWNLOAD EBOOK

The Applied Data Science Workshop on Prostate Cancer Classification and Recognition using Machine Learning and Deep Learning with Python GUI involved several steps and components. The project aimed to analyze prostate cancer data, explore the features, develop machine learning models, and create a graphical user interface (GUI) using PyQt5. The project began with data exploration, where the prostate cancer dataset was examined to understand its structure and content. Various statistical techniques were employed to gain insights into the data, such as checking the dimensions, identifying missing values, and examining the distribution of the target variable. The next step involved exploring the distribution of features in the dataset. Visualizations were created to analyze the characteristics and relationships between different features. Histograms, scatter plots, and correlation matrices were used to uncover patterns and identify potential variables that may contribute to the classification of prostate cancer. Machine learning models were then developed to classify prostate cancer based on the available features. Several algorithms, including Logistic Regression, K-Nearest Neighbors, Decision Trees, Random Forests, Gradient Boosting, Naive Bayes, Adaboost, Extreme Gradient Boosting, Light Gradient Boosting, and Multi-Layer Perceptron (MLP), were implemented. Each model was trained and evaluated using appropriate techniques such as cross-validation and grid search for hyperparameter tuning. The performance of each machine learning model was assessed using evaluation metrics such as accuracy, precision, recall, and F1-score. These metrics provided insights into the effectiveness of the models in accurately classifying prostate cancer cases. Model comparison and selection were based on their performance and the specific requirements of the project. In addition to the machine learning models, a deep learning model based on an Artificial Neural Network (ANN) was implemented. The ANN architecture consisted of multiple layers, including input, hidden, and output layers. The ANN model was trained using the dataset, and its performance was evaluated using accuracy and loss metrics. To provide a user-friendly interface for the project, a GUI was designed using PyQt, a Python library for creating desktop applications. The GUI allowed users to interact with the machine learning models and perform tasks such as selecting the prediction method, loading data, training models, and displaying results. The GUI included various graphical components such as buttons, combo boxes, input fields, and plot windows. These components were designed to facilitate data loading, model training, and result visualization. Users could choose the prediction method, view accuracy scores, classification reports, and confusion matrices, and explore the predicted values compared to the actual values. The GUI also incorporated interactive features such as real-time updates of prediction results based on user selections and dynamic plot generation for visualizing model performance. Users could switch between different prediction methods, observe changes in accuracy, and examine the history of training loss and accuracy through plotted graphs. Data preprocessing techniques, such as standardization and normalization, were applied to ensure the consistency and reliability of the machine learning and deep learning models. The dataset was divided into training and testing sets to assess model performance on unseen data and detect overfitting or underfitting. Model persistence was implemented to save the trained machine learning and deep learning models to disk, allowing for easy retrieval and future use. The saved models could be loaded and utilized within the GUI for prediction tasks without the need for retraining. Overall, the Applied Data Science Workshop on Prostate Cancer Classification and Recognition provided a comprehensive framework for analyzing prostate cancer data, developing machine learning and deep learning models, and creating an interactive GUI. The project aimed to assist in the accurate classification and recognition of prostate cancer cases, facilitating informed decision-making and potentially contributing to improved patient outcomes.


PYTHON GUI PROJECTS WITH MACHINE LEARNING AND DEEP LEARNING

PYTHON GUI PROJECTS WITH MACHINE LEARNING AND DEEP LEARNING

Author: Vivian Siahaan

Publisher: BALIGE PUBLISHING

Published: 2022-01-16

Total Pages: 917

ISBN-13:

DOWNLOAD EBOOK

PROJECT 1: THE APPLIED DATA SCIENCE WORKSHOP: Prostate Cancer Classification and Recognition Using Machine Learning and Deep Learning with Python GUI Prostate cancer is cancer that occurs in the prostate. The prostate is a small walnut-shaped gland in males that produces the seminal fluid that nourishes and transports sperm. Prostate cancer is one of the most common types of cancer. Many prostate cancers grow slowly and are confined to the prostate gland, where they may not cause serious harm. However, while some types of prostate cancer grow slowly and may need minimal or even no treatment, other types are aggressive and can spread quickly. The dataset used in this project consists of 100 patients which can be used to implement the machine learning and deep learning algorithms. The dataset consists of 100 observations and 10 variables (out of which 8 numeric variables and one categorical variable and is ID) which are as follows: Id, Radius, Texture, Perimeter, Area, Smoothness, Compactness, Diagnosis Result, Symmetry, and Fractal Dimension. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D. Finally, you will develop a GUI using PyQt5 to plot boundary decision, ROC, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 2: THE APPLIED DATA SCIENCE WORKSHOP: Urinary Biomarkers Based Pancreatic Cancer Classification and Prediction Using Machine Learning with Python GUI Pancreatic cancer is an extremely deadly type of cancer. Once diagnosed, the five-year survival rate is less than 10%. However, if pancreatic cancer is caught early, the odds of surviving are much better. Unfortunately, many cases of pancreatic cancer show no symptoms until the cancer has spread throughout the body. A diagnostic test to identify people with pancreatic cancer could be enormously helpful. In a paper by Silvana Debernardi and colleagues, published this year in the journal PLOS Medicine, a multi-national team of researchers sought to develop an accurate diagnostic test for the most common type of pancreatic cancer, called pancreatic ductal adenocarcinoma or PDAC. They gathered a series of biomarkers from the urine of three groups of patients: Healthy controls, Patients with non-cancerous pancreatic conditions, like chronic pancreatitis, and Patients with pancreatic ductal adenocarcinoma. When possible, these patients were age- and sex-matched. The goal was to develop an accurate way to identify patients with pancreatic cancer. The key features are four urinary biomarkers: creatinine, LYVE1, REG1B, and TFF1. Creatinine is a protein that is often used as an indicator of kidney function. YVLE1 is lymphatic vessel endothelial hyaluronan receptor 1, a protein that may play a role in tumor metastasis. REG1B is a protein that may be associated with pancreas regeneration. TFF1 is trefoil factor 1, which may be related to regeneration and repair of the urinary tract. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, and MLP classifier. Finally, you will develop a GUI using PyQt5 to plot boundary decision, ROC, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 3: DATA SCIENCE CRASH COURSE: Voice Based Gender Classification and Prediction Using Machine Learning and Deep Learning with Python GUI This dataset was created to identify a voice as male or female, based upon acoustic properties of the voice and speech. The dataset consists of 3,168 recorded voice samples, collected from male and female speakers. The voice samples are pre-processed by acoustic analysis in R using the seewave and tuneR packages, with an analyzed frequency range of 0hz-280hz (human vocal range). The following acoustic properties of each voice are measured and included within the CSV: meanfreq: mean frequency (in kHz); sd: standard deviation of frequency; median: median frequency (in kHz); Q25: first quantile (in kHz); Q75: third quantile (in kHz); IQR: interquantile range (in kHz); skew: skewness; kurt: kurtosis; sp.ent: spectral entropy; sfm: spectral flatness; mode: mode frequency; centroid: frequency centroid (see specprop); peakf: peak frequency (frequency with highest energy); meanfun: average of fundamental frequency measured across acoustic signal; minfun: minimum fundamental frequency measured across acoustic signal; maxfun: maximum fundamental frequency measured across acoustic signal; meandom: average of dominant frequency measured across acoustic signal; mindom: minimum of dominant frequency measured across acoustic signal; maxdom: maximum of dominant frequency measured across acoustic signal; dfrange: range of dominant frequency measured across acoustic signal; modindx: modulation index. Calculated as the accumulated absolute difference between adjacent measurements of fundamental frequencies divided by the frequency range; and label: male or female. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D. Finally, you will develop a GUI using PyQt5 to plot boundary decision, ROC, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 4: DATA SCIENCE CRASH COURSE: Thyroid Disease Classification and Prediction Using Machine Learning and Deep Learning with Python GUI Thyroid disease is a general term for a medical condition that keeps your thyroid from making the right amount of hormones. Thyroid typically makes hormones that keep body functioning normally. When the thyroid makes too much thyroid hormone, body uses energy too quickly. The two main types of thyroid disease are hypothyroidism and hyperthyroidism. Both conditions can be caused by other diseases that impact the way the thyroid gland works. Dataset used in this project was from Garavan Institute Documentation as given by Ross Quinlan 6 databases from the Garavan Institute in Sydney, Australia. Approximately the following for each database: 2800 training (data) instances and 972 test instances. This dataset contains plenty of missing data, while 29 or so attributes, either Boolean or continuously-valued. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D. Finally, you will develop a GUI using PyQt5 to plot boundary decision, ROC, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.


Artificial Intelligence in Medical Imaging

Artificial Intelligence in Medical Imaging

Author: Erik R. Ranschaert

Publisher: Springer

Published: 2019-01-29

Total Pages: 369

ISBN-13: 3319948784

DOWNLOAD EBOOK

This book provides a thorough overview of the ongoing evolution in the application of artificial intelligence (AI) within healthcare and radiology, enabling readers to gain a deeper insight into the technological background of AI and the impacts of new and emerging technologies on medical imaging. After an introduction on game changers in radiology, such as deep learning technology, the technological evolution of AI in computing science and medical image computing is described, with explanation of basic principles and the types and subtypes of AI. Subsequent sections address the use of imaging biomarkers, the development and validation of AI applications, and various aspects and issues relating to the growing role of big data in radiology. Diverse real-life clinical applications of AI are then outlined for different body parts, demonstrating their ability to add value to daily radiology practices. The concluding section focuses on the impact of AI on radiology and the implications for radiologists, for example with respect to training. Written by radiologists and IT professionals, the book will be of high value for radiologists, medical/clinical physicists, IT specialists, and imaging informatics professionals.


Deep Learning and Convolutional Neural Networks for Medical Image Computing

Deep Learning and Convolutional Neural Networks for Medical Image Computing

Author: Le Lu

Publisher: Springer

Published: 2017-07-12

Total Pages: 327

ISBN-13: 331942999X

DOWNLOAD EBOOK

This book presents a detailed review of the state of the art in deep learning approaches for semantic object detection and segmentation in medical image computing, and large-scale radiology database mining. A particular focus is placed on the application of convolutional neural networks, with the theory supported by practical examples. Features: highlights how the use of deep neural networks can address new questions and protocols, as well as improve upon existing challenges in medical image computing; discusses the insightful research experience of Dr. Ronald M. Summers; presents a comprehensive review of the latest research and literature; describes a range of different methods that make use of deep learning for object or landmark detection tasks in 2D and 3D medical imaging; examines a varied selection of techniques for semantic segmentation using deep learning principles in medical imaging; introduces a novel approach to interleaved text and image deep mining on a large-scale radiology image database.


THE APPLIED DATA SCIENCE WORKSHOP: Urinary biomarkers Based Pancreatic Cancer Classification and Prediction Using Machine Learning with Python GUI

THE APPLIED DATA SCIENCE WORKSHOP: Urinary biomarkers Based Pancreatic Cancer Classification and Prediction Using Machine Learning with Python GUI

Author: Vivian Siahaan

Publisher: BALIGE PUBLISHING

Published: 2023-07-23

Total Pages: 327

ISBN-13:

DOWNLOAD EBOOK

The Applied Data Science Workshop on "Urinary Biomarkers-Based Pancreatic Cancer Classification and Prediction Using Machine Learning with Python GUI" embarks on a comprehensive journey, commencing with an in-depth exploration of the dataset. During this initial phase, the structure and size of the dataset are thoroughly examined, and the various features it contains are meticulously studied. The principal objective is to understand the relationship between these features and the target variable, which, in this case, is the diagnosis of pancreatic cancer. The distribution of each feature is analyzed, and potential patterns, trends, or outliers that could significantly impact the model's performance are identified. To ensure the data is in optimal condition for model training, preprocessing steps are undertaken. This involves handling missing values through imputation techniques, such as mean, median, or interpolation, depending on the nature of the data. Additionally, feature engineering is performed to derive new features or transform existing ones, with the aim of enhancing the model's predictive power. In preparation for model building, the dataset is split into training and testing sets. This division is crucial to assess the models' generalization performance on unseen data accurately. To maintain a balanced representation of classes in both sets, stratified sampling is employed, mitigating potential biases in the model evaluation process. The workshop explores an array of machine learning classifiers suitable for pancreatic cancer classification, such as Logistic Regression, K-Nearest Neighbors, Decision Trees, Random Forests, Gradient Boosting, Naive Bayes, Adaboost, Extreme Gradient Boosting, Light Gradient Boosting, Naïve Bayes, and Multi-Layer Perceptron (MLP). For each classifier, three different preprocessing techniques are applied to investigate their impact on model performance: raw (unprocessed data), normalization (scaling data to a similar range), and standardization (scaling data to have zero mean and unit variance). To optimize the classifiers' hyperparameters and boost their predictive capabilities, GridSearchCV, a technique for hyperparameter tuning, is employed. GridSearchCV conducts an exhaustive search over a specified hyperparameter grid, evaluating different combinations to identify the optimal settings for each model and preprocessing technique. During the model evaluation phase, multiple performance metrics are utilized to gauge the efficacy of the classifiers. Commonly used metrics include accuracy, recall, precision, and F1-score. By comprehensively assessing these metrics, the strengths and weaknesses of each model are revealed, enabling a deeper understanding of their performance across different classes of pancreatic cancer. Classification reports are generated to present a detailed breakdown of the models' performance, including precision, recall, F1-score, and support for each class. These reports serve as valuable tools for interpreting model outputs and identifying areas for potential improvement. The workshop highlights the significance of graphical user interfaces (GUIs) in facilitating user interactions with machine learning models. By integrating PyQt, a powerful GUI development library for Python, participants create a user-friendly interface that enables users to interact with the models effortlessly. The GUI provides options to select different preprocessing techniques, visualize model outputs such as confusion matrices and decision boundaries, and gain insights into the models' classification capabilities. One of the primary advantages of the graphical user interface is its ability to offer users a seamless and intuitive experience in predicting and classifying pancreatic cancer based on urinary biomarkers. The GUI empowers users to make informed decisions by allowing them to compare the performance of different classifiers under various preprocessing techniques. Throughout the workshop, a strong emphasis is placed on the significance of proper data preprocessing, hyperparameter tuning, and robust model evaluation. These crucial steps contribute to building accurate and reliable machine learning models for pancreatic cancer prediction. By the culmination of the workshop, participants have gained valuable hands-on experience in data exploration, machine learning model building, hyperparameter tuning, and GUI development, all geared towards addressing the specific challenge of pancreatic cancer classification and prediction. In conclusion, the Applied Data Science Workshop on "Urinary Biomarkers-Based Pancreatic Cancer Classification and Prediction Using Machine Learning with Python GUI" embarks on a comprehensive and transformative journey, bringing together data exploration, preprocessing, machine learning model selection, hyperparameter tuning, model evaluation, and GUI development. The project's focus on pancreatic cancer prediction using urinary biomarkers aligns with the pressing need for early detection and treatment of this deadly disease. As participants delve into the intricacies of machine learning and medical research, they contribute to the broader scientific community's ongoing efforts to combat cancer and improve patient outcomes. Through the integration of data science methodologies and powerful visualization tools, the workshop exemplifies the potential of machine learning in revolutionizing medical diagnostics and healthcare practices.


Machine Learning in Medicine

Machine Learning in Medicine

Author: Ayman El-Baz

Publisher: CRC Press

Published: 2021-08-03

Total Pages: 313

ISBN-13: 1351588745

DOWNLOAD EBOOK

Machine Learning in Medicine covers the state-of-the-art techniques of machine learning and their applications in the medical field. It presents several computer-aided diagnosis (CAD) systems, which have played an important role in the diagnosis of several diseases in the past decade, e.g., cancer detection, resulting in the development of several successful systems. New developments in machine learning may make it possible in the near future to develop machines that are capable of completely performing tasks that currently cannot be completed without human aid, especially in the medical field. This book covers such machines, including convolutional neural networks (CNNs) with different activation functions for small- to medium-size biomedical datasets, detection of abnormal activities stemming from cognitive decline, thermal dose modelling for thermal ablative cancer treatments, dermatological machine learning clinical decision support systems, artificial intelligence-powered ultrasound for diagnosis, practical challenges with possible solutions for machine learning in medical imaging, epilepsy diagnosis from structural MRI, Alzheimer's disease diagnosis, classification of left ventricular hypertrophy, and intelligent medical language understanding. This book will help to advance scientific research within the broad field of machine learning in the medical field. It focuses on major trends and challenges in this area and presents work aimed at identifying new techniques and their use in biomedical analysis, including extensive references at the end of each chapter.


Dermoscopy Image Analysis

Dermoscopy Image Analysis

Author: M. Emre Celebi

Publisher: CRC Press

Published: 2015-10-16

Total Pages: 458

ISBN-13: 1482253275

DOWNLOAD EBOOK

Dermoscopy is a noninvasive skin imaging technique that uses optical magnification and either liquid immersion or cross-polarized lighting to make subsurface structures more easily visible when compared to conventional clinical images. It allows for the identification of dozens of morphological features that are particularly important in identifyin


Recent Advances on Soft Computing and Data Mining

Recent Advances on Soft Computing and Data Mining

Author: Rozaida Ghazali

Publisher: Springer Nature

Published: 2019-12-04

Total Pages: 491

ISBN-13: 3030360563

DOWNLOAD EBOOK

This book provides an introduction to data science and offers a practical overview of the concepts and techniques that readers need to get the most out of their large-scale data mining projects and research studies. It discusses data-analytical thinking, which is essential to extract useful knowledge and obtain commercial value from the data. Also known as data-driven science, soft computing and data mining disciplines cover a broad interdisciplinary range of scientific methods and processes. The book provides readers with sufficient knowledge to tackle a wide range of issues in complex systems, bringing together the scopes that integrate soft computing and data mining in various combinations of applications and practices, since to thrive in these data-driven ecosystems, researchers, data analysts and practitioners must understand the design choice and options of these approaches. This book helps readers to solve complex benchmark problems and to better appreciate the concepts, tools and techniques used.


Data Mining for Biomedical Applications

Data Mining for Biomedical Applications

Author: Jinyan Li

Publisher: Springer Science & Business Media

Published: 2006-03-23

Total Pages: 163

ISBN-13: 3540331042

DOWNLOAD EBOOK

This book constitutes the refereed proceedings of the International Workshop on Data Mining for Biomedical Applications, BioDM 2006, held in Singapore in conjunction with the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2006). The 14 revised full papers presented together with one keynote talk were carefully reviewed and selected from 35 submissions. The papers are organized in topical sections


Deep Learning for Medical Image Analysis

Deep Learning for Medical Image Analysis

Author: S. Kevin Zhou

Publisher: Academic Press

Published: 2023-11-23

Total Pages: 544

ISBN-13: 0323858880

DOWNLOAD EBOOK

Deep Learning for Medical Image Analysis, Second Edition is a great learning resource for academic and industry researchers and graduate students taking courses on machine learning and deep learning for computer vision and medical image computing and analysis. Deep learning provides exciting solutions for medical image analysis problems and is a key method for future applications. This book gives a clear understanding of the principles and methods of neural network and deep learning concepts, showing how the algorithms that integrate deep learning as a core component are applied to medical image detection, segmentation, registration, and computer-aided analysis.· Covers common research problems in medical image analysis and their challenges · Describes the latest deep learning methods and the theories behind approaches for medical image analysis · Teaches how algorithms are applied to a broad range of application areas including cardiac, neural and functional, colonoscopy, OCTA applications and model assessment · Includes a Foreword written by Nicholas Ayache