Free Paper on Neural Networks

01/2/27, 12:19

Paper title:

Explanations for Neural Networks by Neural Networks

Year of publication: 2022

Field: Computer Engineering

Specialization: Artificial Intelligence


5. Related Approaches

Various methods for interpreting black-box models have been proposed over the past decades, and overviews from different perspectives are available in the literature. According to the taxonomy of Molnar, our approach can be classified as global, post-hoc, and model-specific, with an intrinsically interpretable model (i.e., a polynomial function) as the interpretation. We therefore focus on global interpretation methods, whose goal is to uncover the decision-making process of the model based on the impact of the learned parameters (e.g., the weights and biases of a neural network) on its features. Global explanations thus allow us to better understand the relationships and interactions between the features within the learned model.

In contrast, local explainability methods such as LIME or SHAP only generate explanations for the model's prediction on a single instance. They aim for local fidelity, which means that the explanation only accounts for this specific instance of interest and does not claim high fidelity for other instances. This is fundamentally different from the goal of global interpretability, where we want to find an interpretation that has high fidelity for the complete model and not only for a specific instance.

There is a wide range of active research in the field of global interpretability. One line of research focuses on uncovering the impact of a feature or a set of features on the model's predictions, for instance through partial dependence plots (PDP), feature interaction, or (permutation) feature importance. Another is concerned with finding data points that are representative of the learned model in order to increase interpretability, for instance prototypes and criticisms. While these approaches resemble I-Nets in being global interpretability methods that can be applied post-hoc, they differ significantly in the result that the interpretation method produces.
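
To make one of the global methods named above concrete, the following is a minimal sketch of permutation feature importance, using only the Python standard library. It is an illustration and not code from the paper: `black_box`, the synthetic data, and the choice of mean squared error as the score are all assumptions made for the example.

```python
import random


def mse(predict, rows, targets):
    """Mean squared error of `predict` over a list of feature rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_repeats=10, seed=0):
    """Average increase in MSE when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only, which breaks its link to the target.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            total += mse(predict, shuffled, targets) - baseline
        importances.append(total / n_repeats)
    return importances


def black_box(row):
    # Hypothetical stand-in for a trained model; it ignores feature 2.
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
rows = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(500)]
targets = [black_box(r) for r in rows]

importances = permutation_importance(black_box, rows, targets)
print(importances)
```

In this toy setting the first feature should receive the largest score and the unused third feature a score of zero. The output is a ranking of features, not an interpretable model, which is the difference in results that the text points out.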
Because our approach presents an intrinsically interpretable model as the result of the interpretation, the most relevant work for our paper is work on global surrogate models. According to Molnar, a global surrogate model is an interpretable model trained to approximate the predictions of a black-box model. Interpretability is thus achieved by inspecting the parameters of the surrogate model. In the literature, surrogate models are considered model-agnostic and are trained on the predictions of the model we want to interpret. Existing approaches usually differ only in the type of surrogate model chosen and in its training procedure. Different types of surrogate models have been explored, including mathematical functions, decision trees, or rule sets.

While the result of the interpretation (i.e., an intrinsically interpretable model) of global surrogate models matches our approach, there are also major differences. All of the approaches mentioned require an optimization process during interpretation, whereas our approach transforms the interpretation task into a machine learning problem that is solved up-front using neural networks. Additionally, existing approaches generate interpretations based on samples from the model to be interpreted, which makes them model-agnostic. In contrast, the I-Net uses the network parameters of the λ network as the basis for generating explanations; this approach is therefore model-specific by definition. In summary, unlike existing approaches, I-Nets enable real-time interpretations of previously unseen, already trained models from a representation of their network function, without relying on any data.
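
The conventional, sample-based surrogate procedure that the section contrasts with I-Nets can be sketched as follows. This is not the I-Net method itself: it shows the per-model optimization step (here, a least-squares polynomial fit) that existing approaches run at interpretation time. The function `black_box`, the sampling range, and the polynomial degree are hypothetical choices for illustration.

```python
import math
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest degree first."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j), b[i] = sum y * x^i.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the surrogate polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def black_box(x):
    # Hypothetical stand-in for a trained network with one input.
    return math.tanh(2.0 * x) + 0.1 * x


# Step 1: query the black box on sampled inputs (this is what makes the
# procedure model-agnostic: only predictions are used, not parameters).
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [black_box(x) for x in xs]

# Step 2: fit the interpretable surrogate to those predictions.
coeffs = fit_polynomial(xs, ys, degree=3)

# Step 3: measure fidelity of the surrogate to the black box on the samples.
fidelity_mse = sum((evaluate(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(coeffs, fidelity_mse)
```

The fitted coefficients are the interpretation: they can be read directly as a polynomial. Note that steps 1 and 2 must be repeated for every model to be interpreted, whereas the section describes I-Nets as producing such a function directly from the network parameters without sampled data.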
