Morten Nielsen, CBS, BioCentrum, DTU CENTER FOR BIOLOGICAL SEQUENCE ANALYSIS TECHNICAL UNIVERSITY OF DENMARK DTU Neural Network training.
Neural network programs
• how – classification neural network
• howlin – real-value neural network
• nnlinplayer – neural network player, i.e. no training

How2doit
• how and howlin are clumsy but very fast and efficient Fortran programs
• Three important files
  – Parameter file: howlin.dat
  – Data file
  – Synaps (weight) file

Output
• Format of output file
• Plotting training and test performance
  – howlinplot fileout

Neural networks
• Neural networks can learn higher-order correlations!
  – What does this mean?
    0 0 => 0
    0 1 => 1
    1 0 => 1
    1 1 => 0
  No linear function can learn this pattern

Neural networks
[Figure: network with weights w11, w12, w21, w22, v1, v2]
w11 = 1, w12 = -1, w21 = 1, w22 = -1, v1 = 0.5, v2 = -0.5

nnlinplayer
• Use weight file(s) to generate neural network predictions
• Format
  – nnlinplayer synapsfilelist inputfile
• Makes consensus prediction over N neural networks
• Input file must be generated separately
  – seq2inp data
• Using pipes
  – seq2inp data | nnlinplayer synlist --

how
• Classification network
• Generates input data directly from sequence
  – RIISSIEQKEENKGGEDKLKMIREYRQMVE
• Input is how files

Useful programs
• fasta2pep
• seq2inp
• ranlines
• splitfile
• balanceset
• xycorr
• Examples
  fasta2pep ex.fsa | grep -v '#' | seq2inp -- | grep -v '#' | ranlines -- | grep -v '#' | splitfile -nc 4 -
  seq2inp data | nnlinplayer synlist -- | grep -v '#' | args 1,3 | xycorr
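The nnlinplayer slide states that the program makes a consensus prediction over N neural networks, but the slides do not say how that consensus is formed. The following is a minimal Python sketch, assuming the consensus is a plain per-example average of the individual network outputs. The function name consensus and the numbers are hypothetical and are not taken from nnlinplayer.

```python
from statistics import mean


def consensus(predictions_per_network):
    """Average, example by example, the outputs of several networks.

    predictions_per_network: one list of scores per network, all of the
    same length (one score per input example).
    ASSUMPTION: consensus = arithmetic mean; the slides do not confirm this.
    """
    n_examples = len(predictions_per_network[0])
    if any(len(p) != n_examples for p in predictions_per_network):
        raise ValueError("all networks must score the same examples")
    # zip(*...) regroups the scores so each tuple holds one example's
    # scores across all networks.
    return [mean(scores) for scores in zip(*predictions_per_network)]


# Hypothetical outputs of N = 3 networks on the same three examples.
network_outputs = [
    [0.80, 0.10, 0.55],
    [0.70, 0.20, 0.45],
    [0.90, 0.00, 0.50],
]
print(consensus(network_outputs))
```

Averaging several networks trained from different starting points or data splits tends to smooth out the noise of any single network, which may be why a list of synaps files is passed rather than a single file.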
Exercises
• Copy all files from
  – /usr/opt/www/pub/CBS/researchgroups/immunology/intro/NeuralNetworks/exercise/*
  to some directory
• Open the file doit
  – What does the program do?
  – Run the program and save the output to a file named datafile
• Train a howlin neural network
  – Set the number of hidden neurons in the howlin2002.dat file to 0
  – Run the training by typing
    howlin2002 < howlin2002.dat > output
  – Plot the training/test performance using the howlinplot program
  – Redo the training using 2 hidden neurons
  – Check the synaps file. What are the weight values?
• Do the prediction of T cell epitopes exercise
  – www.cbs.dtu.dk/courses/27485.imm/exercise5/index.php
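The exercise contrasts training with 0 and 2 hidden neurons. As a rough illustration of why hidden neurons matter, the pattern 0 0 => 0, 0 1 => 1, 1 0 => 1, 1 1 => 0 from the slides can be checked in a few lines of Python. This is only a sketch: the step activation, the hand-picked weights, and the names linear_unit and hidden_net are assumptions made for illustration. They are not the weights shown on the slides, and nothing here reproduces howlin. The search over a small weight grid finds no single linear threshold unit that fits the pattern, while a network with two hidden units does.

```python
from itertools import product

# The pattern from the slides (XOR).
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(x):
    """Threshold activation (an assumption; howlin may use another)."""
    return 1 if x > 0 else 0


def linear_unit(x1, x2, w1, w2, b):
    """A network with 0 hidden neurons: one weighted sum, one threshold."""
    return step(w1 * x1 + w2 * x2 + b)


def hidden_net(x1, x2):
    """A network with 2 hidden neurons and hand-picked (hypothetical) weights."""
    h_or = step(x1 + x2 - 0.5)   # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)  # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)


# Try every weight/bias combination on a coarse grid from -4.0 to 4.0.
grid = [i / 2 for i in range(-8, 9)]
linear_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if all(linear_unit(x1, x2, w1, w2, b) == y for (x1, x2), y in XOR.items())
]

print("linear units that fit the pattern:", len(linear_solutions))
print("hidden-layer outputs:", [hidden_net(*x) for x in XOR])
```

The grid search is not a proof, but the result holds for any weights: the four points are not linearly separable, so the second input's effect on the output depends on the first, which is the higher-order correlation the slide refers to.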