MAY 2017 | ISSUE 277 http://www.linuxjournal.com


Linux Journal
Since 1994: The Original Magazine of the Linux Community

On the cover:
3D Imaging of Heart Activity with Open-Source Software
How-To: Build Your Own Cluster for Beginners
Install a Network Monitor to Find Bandwidth Hogs
Novelty Detection with Machine Learning
EOF: Will Anything Make Linux Obsolete?
Watch: Issue Overview

GEEK GUIDES
Practical books for the most technical people on the planet. Download books for free with a simple one-time registration. http://geekguide.linuxjournal.com

NEW! An Architect’s Guide: Linux for Enterprise IT. Author: Sol Lederman. Sponsor: SUSE.
NEW! Memory: Past, Present and Future—and the Tools to Optimize It. Author: Petros Koutoupis. Sponsor: Intel.
Cloud-Scale Automation with Puppet. Author: John S. Tonello. Sponsor: Puppet.
Why Innovative App Developers Love High-Speed OSDBMS. Author: Ted Schmidt. Sponsor: IBM.
Tame the Docker Life Cycle with SUSE. Author: John S. Tonello. Sponsor: SUSE.
SUSE Enterprise Storage 4. Author: Ted Schmidt. Sponsor: SUSE.
BotFactory: Automating the End of Cloud Sprawl. Author: John S. Tonello. Sponsor: BotFactory.io.
Containers 101. Author: Sol Lederman. Sponsor: Puppet.

CONTENTS
MAY 2017, ISSUE 277

FEATURES

3D Imaging of Heart Activity with Open-Source Software
Open source is the way to go for any research project, but sometimes you have to draw closed source into the mix.
Jacques de Hooge

BYOC: Build Your Own Cluster, Part I—Design
Design a robust compute cluster from the ground up.
Nathan R. Vance, Michael L. Poublon and William F. Polik

COLUMNS

Reuven M. Lerner’s At the Forge: Novelty and Outlier Detection
Dave Taylor’s Work the Shell: Working with YouTube and Extracting Audio
Kyle Rankin’s Hack and /: Sysadmin 101: Leveling Up
Shawn Powers’ The Open-Source Classroom: Tracking Down Blips
Doc Searls’ EOF: Will Anything Make Linux Obsolete?

IN EVERY ISSUE

Current_Issue.tar.gz
Letters
UPFRONT
Editors’ Choice
New Products
Advertisers Index

ON THE COVER

3D Imaging of Heart Activity with Open-Source Software
How-To: Build Your Own Cluster for Beginners
Install a Network Monitor to Find Bandwidth Hogs
Novelty Detection with Machine Learning
EOF: Will Anything Make Linux Obsolete?

LINUX JOURNAL (ISSN 1075-3583) is published monthly by Belltown Media, Inc., PO Box 980985, Houston, TX 77098 USA. Subscription rate is $29.50/year. Subscriptions start with the next issue.

Executive Editor: Jill Franklin, [email protected]
Senior Editor: Doc Searls, [email protected]
Associate Editor: Shawn Powers, [email protected]
Art Director: Garrick Antikajian, [email protected]
Products Editor: James Gray, [email protected]
Editor Emeritus: Don Marti, [email protected]
Technical Editor: Michael Baxter, [email protected]
Senior Columnist: Reuven Lerner, [email protected]
Security Editor: Mick Bauer, [email protected]
Hack Editor: Kyle Rankin, [email protected]
Virtual Editor: Bill Childers, [email protected]

Contributing Editors
Ibrahim Haddad, Robert Love, Zack Brown, Dave Phillips, Marco Fioretti, Ludovic Marcotte, Paul Barry, Paul McKenney, Dave Taylor, Dirk Elmendorf, Justin Ryan, Adam Monsen

President: Carlie Fairchild, [email protected]
Publisher: Mark Irgang, [email protected]
Associate Publisher: John Grogan, [email protected]
Director of Digital Experience: Katherine Druckman, [email protected]
Accountant: Candy Beauchamp, [email protected]

Linux Journal is published by, and is a registered trade name of, Belltown Media, Inc., PO Box 980985, Houston, TX 77098 USA.

Editorial Advisory Panel
Nick Baronian, Kalyana Krishna Chadalavada, Brian Conner, Keir Davis, Michael Eager, Victor Gregorio, David A. Lane, Steve Marquez, Dave McAllister, Thomas Quinlan, Chris D. Stark, Patrick Swartz

Advertising
E-MAIL: [email protected]
URL: www.linuxjournal.com/advertising
PHONE: +1 713-344-1956 ext. 2

Subscriptions
E-MAIL: [email protected]
URL: www.linuxjournal.com/subscribe
MAIL: PO Box 980985, Houston, TX 77098 USA

LINUX IS A REGISTERED TRADEMARK OF LINUS TORVALDS.

You cannot keep up with data explosion. Manage data expansion with SUSE Enterprise Storage. SUSE Enterprise Storage, the leading open source storage solution, is highly scalable and resilient, enabling high-end functionality at a fraction of the cost. suse.com/storage

Current_Issue.tar.gz

Doing Big Things

SHAWN POWERS

I bought a book a few years back titled Installing Linux on a Dead Badger by Lucy Snyder. When I see that book on my bookshelf, it still makes me chuckle. And although a dead badger is certainly not a common operating system platform, it seems like Linux continues to end up on more and more hardware every year. Often those devices are tiny, but sometimes those devices are giant clusters of computers that crunch trillions of numbers a second. And of course, some of the tiniest installations end up having the biggest effects (smartphones, for instance). This month’s issue is a reminder of just how well our favorite operating system has infiltrated our lives.

We start with Reuven M. Lerner continuing his theme of computer-based learning. Many of you probably remember the Sesame Street game Grover played with viewers called “One of These Things Is Not Like the Others”. In a similar vein, Reuven shows how you can teach a bit of software how to play that game as it pertains to sets of data. Rather than just chopping off the highs and lows in a set of data, computer learning can determine what data is actually out of place.

Dave Taylor follows with some really awesome scripting. If you’ve ever spent time mindlessly browsing YouTube, you’ve probably liked a video so much at one point that you wanted a local copy. Or, perhaps you’ve wished you could strip the audio off a YouTube video and play it on your sound system. Dave describes a really cool tool for extracting audio and video from YouTube URLs. Google might not be thrilled at the idea of downloading copies of videos, but once again, Linux saves the day.

Speaking of saving the day, Kyle Rankin continues his series on systems administration. This month, he helps define the various levels and titles of administrative abilities. Ever wonder if you qualify as a junior systems administrator or a senior systems administrator? Kyle helps explain what the various titles mean, which is invaluable if you’re applying for a job or planning to hire more help. Even if your interests don’t lie in administration, the information is critical for anyone in IT to understand.

I did a little bit of investigating this month. Specifically, I investigated my network trying to track down some rogue traffic on my router. I learned a lot along the way, and in the end, I had a face-palm moment. Nevertheless, I figured all the network monitoring and investigative knowledge made it worthwhile. If you’re uncertain what is happening on your local network, I encourage you to read my column.

Jacques de Hooge describes another way Linux and open source can help you determine what is going on, specifically what’s going on in your chest. Using various open-source tools, along with a few proprietary ones, Jacques tells how Linux helps create 3D models of the heart for diagnoses. It’s a perfect example of Linux quietly making the world go round, and it’s also really cool to learn about.

We finish off the issue with the first part of a series on building a computer cluster using Linux. Nathan R. Vance, Michael L. Poublon and William F. Polik join forces to teach how to create an appropriate cluster from beginning to end. Whether you want to build your own cluster, or just want to learn about the technology, it’s an awesome series that we’re starting this month.

We also have the normal collection of Linux Journal goodies, including tech tips, new product announcements and UpFront oddities from around the internet. Whether you love Linux because of all the tiny places it can be installed or think it’s awesome that Linux powers the internet, this issue should tickle your fancy all the same.

Shawn Powers is the Associate Editor for Linux Journal. He’s also the Gadget Guy for LinuxJournal.com, and he has an interesting collection of vintage Garfield coffee mugs. Don’t let his silly hairdo fool you, he’s a pretty ordinary guy and can be reached via email at [email protected]. Or, swing by the #linuxjournal IRC channel on Freenode.net.

VIDEO: Shawn Powers runs through the latest issue.

LETTERS

EOF January 2017

I have been an avid and loyal Linux Journal subscriber for years. But after reading Doc Searls’ January 2017 EOF, I am thoroughly disgusted. I will not be renewing. Politics has no place in a magazine such as LJ. If I wanted someone’s political opinion, I’d turn on MSNBC or CNN. The arrogance displayed by publishing that article is astounding.
—Doug McComber

Doc Searls replies: Thanks for writing. And for debugging mine. Under similar criticisms in the web version of the column, I wrote this (in a comment at http://www.linuxjournal.com/content/debugging-democracy):

fulp01 isn’t a troll. He’s a subscriber, and we value those. He’s also right to call me to task. Judging from the almost entirely negative response this column has received so far, I was the one doing the trolling, though that wasn’t my intent. Calling Trump a troll also distracted readers from my main point, which is that journalism is suffering in a world where a business based on surveillance is programmatically dividing people into mutually hostile echo chambers, which makes democracy suffer as well. And that, because this whole echo-system is programmatic, and to a high degree runs on Linux-based infrastructure, we (or at least some of us) are in a position to help fix it.

I also wrote a similar response for the magazine, and something like it in my March 2017 column. Hope those help, and that you stay with us.

Open-Source Classroom—Passwords, Security Questions

Regarding Shawn Powers’ “All Your Accounts Are Belong to Us” article in the February 2017 issue: good information regarding passwords and 2nd-factor authentication, etc. One thing I’ve done for the last ten years or so is create fictitious alternative personal data that is used for online accounts: things like birthday, mother’s maiden name, high-school teacher, first car and so on. I only use my real personal data when it must be done, like for banking or government stuff. I store my alternative personal information in my password manager so I don’t have to remember it. This way, if any of my social media or other online accounts ever do have a breach, the data that is leaked can’t be used as identification verification by a bad actor or as a pivot point to get into other accounts. The key is being consistent.

Regarding password managers, since I don’t trust any third party, my password manager is a strongly encrypted text file that is synced via Dropbox with a very strong high-entropy password that I’ve committed to memory.

One more thing, regarding using fingerprints and other biometrics to log on, keep in mind that you can be compelled to place your finger or have a retina scan to unlock your device, but courts so far have upheld the divulging of passwords as “something you know” and hence under 5th Amendment protection.
—Mark Dean

At Your Service

SUBSCRIPTIONS: Linux Journal is available in a variety of digital formats, including PDF, .epub, .mobi and an online digital edition, as well as apps for iOS and Android devices. Renewing your subscription, changing your email address for issue delivery, paying your invoice, viewing your account details or other subscription inquiries can be done instantly online: http://www.linuxjournal.com/subs. Email us at [email protected] or reach us via postal mail at Linux Journal, PO Box 980985, Houston, TX 77098 USA. Please remember to include your complete name and address when contacting us.

ACCESSING THE DIGITAL ARCHIVE: Your monthly download notifications will have links to the various formats and to the digital archive. To access the digital archive at any time, log in at http://www.linuxjournal.com/digital.

LETTERS TO THE EDITOR: We welcome your letters and encourage you to submit them at http://www.linuxjournal.com/contact or mail them to Linux Journal, PO Box 980985, Houston, TX 77098 USA. Letters may be edited for space and clarity.

WRITING FOR US: We always are looking for contributed articles, tutorials and real-world stories for the magazine. An author’s guide, a list of topics and due dates can be found online: http://www.linuxjournal.com/author.

FREE e-NEWSLETTERS: Linux Journal editors publish newsletters on both a weekly and monthly basis. Receive late-breaking news, technical tips and tricks, an inside look at upcoming issues and links to in-depth stories featured on http://www.linuxjournal.com. Subscribe for free today: http://www.linuxjournal.com/enewsletters.

ADVERTISING: Linux Journal is a great resource for readers and advertisers alike. Request a media kit, view our current editorial calendar and advertising due dates, or learn more about other advertising and marketing opportunities by visiting us on-line: http://www.linuxjournal.com/advertising. Contact us directly for further information: [email protected] or +1 713-344-1956 ext. 2.

Shawn Powers replies: That’s all really good information, and I’ve considered the alternate persona thing as well. Having security questions for password recovery just seems like a bad idea, since most of the questions are fairly easy to figure out, especially if you know the person. My bank, for example, asks, “Where were you born?”, “Who’s your favorite singer?” and “Who’s your favorite author?”, which are all things widely known to anyone who knows me even online. You’re also correct about the biometrics. They’re not a great way to secure a phone, but they do offer a convenience factor that in some cases I deem worth the downside. Thankfully, some apps require multiple authentication factors. Copay, my Bitcoin wallet, for instance, can be forced to require a password and biometrics. Anyway, I think the best thing we can do as an IT community is make sure we’re educating folks who might not understand the significance of securing their accounts and data. Often just understanding is enough to get people to make better choices.

Politics Don’t Belong in Technical Publications

I respect that this is your publication, but as a customer, I wanted to let you know my opinion and intent should the political rants continue. I don’t read this publication to learn about opinions of Trump, the electoral college or the Clintons. I subscribe and pay for Linux-related news and information. Simply put, if the political overtones continue, I’ll save myself the subscription cost, browse to Fox news or CNN, and not read this publication in the future. Thank you for your consideration, and I hope to see less word count about politics and more interesting content about Linux.
—G. Powers

Doc Searls replies: Grant, I assume you are writing in response to my “Debugging Democracy” column in the January 2017 issue—the one and only time in two decades of writing for Linux Journal that I’ve ever brought up politics (or at least that I remember). So you know, I’ve already responded to similar pushback from other readers. Here is what I wrote in response to a reader named Mark:

Mark is right. I do owe readers an apology. By calling Donald Trump a troll (take a look at the Wikipedia definition of Internet troll (https://en.wikipedia.org/wiki/Internet_troll), and draw your own conclusions), I was being a troll as well. Even though trolling wasn’t my intent, that has been the effect so far: every response to my January column, both here and on our website, has been as negative as Mark’s, and for the same reasons. Opening with that remark also failed to support the main purpose of that column, which was to call for help in rescuing journalism—and real journals such as this one—from drowning in a sea of “content”, way too much of which is crap routed by algorithms aimed by surveillance-gathered data into echo chambers of the like-minded. This has the effect of increasing enmity and blame toward those in echo chambers with opposing sympathies, which is worse than dangerous in democratic societies, because it tears apart the center spaces of basic agreement those societies require. You can see how this looks in The Wall Street Journal’s Blue Feed, Red Feed site (http://graphics.wsj.com/blue-feed-red-feed), subtitled “See Liberal Facebook and Conservative Facebook, Side by Side”. I am sure most of the systems driving us into hostile camps are built on Linux. (Isn’t everything now?) So I don’t think I’m off base calling for help here.

I believe that was published (with Mark’s letter) in the March 2017 issue. I hope this addresses your concerns. If not, let us know.

Doc Searls’ Columns

I hope this note finds you well and enjoying the riches of all things Linux. This is just a short note of appreciation for Doc Searls and his monthly Linux Journal columns. I suspect some readers are occasionally puzzled about the nature of those columns: they’re not exactly technical “how-to” sorts of things. I think it’s important to keep in mind the global picture, and I’m glad that LJ sees fit to give voice to such important issues. It would be simple to take the low road and just keep to the geek stuff. Thanks for continuing to take the high road.
—David Klann

Linux Desktop Use Case (from De Bortoli Wines, Australia)

Reaching out to Doc Searls as a recent article of his lamented the state of the Linux Desktop. I thought you might be interested in connecting with a company with Linux as the default desktop, since …. (Warning: shameless propaganda ahead.) From …: http://www.computerworld.com.au/article/…/de_bortoli_wines_gets_taste_linux and http://www.google.com.au/search?hl=en&q=de+bortoli+open+standards+OR+open+source+OR+Linux.
—Bill

Doc Searls and Content

Regarding Doc Searls’ “The Problem with Content” in the March 2017 issue: nice editorial, but I think you are missing another facet of what happened to the “media” out there, namely that of partial news. I used to read a lot of newspapers when I was younger. Over time, it became impossible to ignore a certain bias in most media outlets’ reporting. And as time went on, it became more and more difficult to keep paying for such reporting. One needs to look no further than the hyperventilating meltdown that the “media” has suffered under President Trump to understand that reporters have intentionally left about half of the population behind. Certainly other factors are at play, but I’m certain things wouldn’t be quite as bad as they are if the “media” had not alienated half of its potential readership.
—Aki Korhonen

Doc Searls replies: Thanks, Aki. Good points. I think all news is partial, in at least two meanings of the word: it’s both incomplete and biased in some way. I also think the internet has utterly changed all the old media outlets by supporting countless new ones, while social media on the net has driven coverage and conversation into echosystems that not only don’t talk with each other, but distrust and dislike each other more and more, as they get fed one-sided “content”, because that’s what algorithms send to them. As I mentioned previously, to see this at work, check out The Wall Street Journal’s “Blue Feed/Red Feed”: http://graphics.wsj.com/blue-feed-red-feed. Kinda scary. In the old media world, there was common ground. In the new one, it’s pretty much gone. I agree that big-name big-city papers and broadcasters ignored much of the country, roughly since Bush the Elder, which is why many people in those places feel left behind. But other media came in and served those people, who not only just elected a president they prefer, but also gave the GOP a majority in both houses of Congress. Hit SCAN on your radio, even in major markets, and most of the non-sports talk you’ll hear is hard right. When I’m in red states, it seems like every establishment that has TVs for patrons (even hospital waiting rooms) is playing Fox News. At this point, however, both of the old media sides are in trouble, because the internet isn’t just changing every media game; it’s inviting many new ones, most of which we haven’t seen yet.

Archive 1994–2016 NOW AVAILABLE! SAVE $10.00 by using discount code 2016ARCH at checkout. Coupon code expires 5/28/2017. www.linuxjournal.com/archive

Shotcut

I’d love for you to review Shotcut video editor. I recently used it to edit ~… hours of old VHS tapes that were digitally captured into about ten hours to play in a loop on an RPi running Kodi as background/icebreaker for a …th reunion party. It got people talking and reminiscing, and was a great success.

I found parts of the UI clunky, and some of the default timeline behaviors when cutting were less than helpful, but it never crashed. It did lock up once while rendering for some mysterious reason, but I just killed it and restarted it with the saved XML project file. I was impressed. The best thing is it’s totally self-contained: just unzip the archive, run the supplied startup script, and off you go. No dependency headaches. All Linux software should aspire to this.
—Walter B. Kulecz

Shawn Powers replies: Thanks for the heads up. I’ve never tried Shotcut, but I’ll give it a whirl. I’ll try to write up a review as well, depending on how my experience goes.

New F150!

To Shawn Powers: Can you post some photos of your new F150 rig in the next issue of LJ? Just checking if you’re making that one up.
—David

Shawn Powers replies: David, not only do I really have an F-150, but check out the license plate!

Shawn’s F150—check out the license plate!

WRITE LJ A LETTER: We love hearing from our readers. Please send us your comments and feedback via http://www.linuxjournal.com/contact.

PHOTOS: Send your Linux-related photos to [email protected], and we'll publish the best ones here.

UPFRONT
NEWS + FUN

diff -u
What’s New in Kernel Development

Firmware support has become more and more difficult to maintain over time, especially as more and more features have been added. Some features aren’t even about loading firmware so much as just doing something that’s more easily done at the same time as loading the firmware. And, whenever the firmware API gets updated, the patch has to include updates to all user code that uses that particular programmer interface. Over time, this tends to make the patches bigger and more error-prone overall.

Luis R. Rodriguez recently proposed a new firmware API, not quite a total replacement of the existing code, but something that would at least make more sense and tolerate updates more easily. At the same time, the new code would leave open the question of certain thorny problems, such as what to do when a particular piece of firmware doesn’t work. What’s the fallback procedure? For this, he described the existing code as “hairy” and didn’t want to touch it until various other issues could be resolved. For example, he said that the kernel’s init code contained race conditions that would have an impact on any attempt to fix up the firmware fallback implementation, so the one would have to wait for the other.

Various folks like Greg Kroah-Hartman and Bjorn Andersson had suggestions and objections. In particular, Bjorn wanted the old firmware API to go away at some point and be fully replaced by the new interface. But, Luis said the two would have to coexist for the foreseeable future, although he did add that the old interface would become static, and all new fixes and updates would go into the new API.

Hardware acceleration involves performing certain work in hardware that was specifically built for that purpose, as opposed to doing the same work using the standard opcodes available on a general-purpose CPU. In terms of efficiency, all else being equal, specialized hardware beats the pants off general-purpose CPUs.

Binoy Jayan recently wanted to migrate some of the kernel’s crypto code into hardware to take advantage of that speedup opportunity, specifically the initialization vector (IV) routines in the dm-crypt.c file. But, Milan Broz warned against moving the code out of dm-crypt.c, because it would make it harder for the crypto team to modify the key data structures in the future, if they so desired. Also, he said, some of the IV generator code was hacky and risky, and it shouldn’t be considered good enough to migrate into hardware.

Ultimately, Binoy’s code became more and more controversial, as folks like Ondrej Mosnáček proposed completely different solutions to the problems Binoy wanted to address. By the end of the discussion, hardware acceleration remained an option for the crypto IV routines, but there still was no agreement on the exact implementation.

The quest to access more and more memory is ongoing. Nikita Yushchenko recently pointed out that while PCI devices potentially could support up to 64-bit DMA (direct memory access) addressing, some of the PCI code, such as the host bridge, had software limitations that prevented it. Nikita wanted at least to prevent PCI devices from claiming the ability to access that much memory, if it couldn’t in reality.

During the course of discussion, however, and particularly with Arnd Bergmann, who’d written his own patch to address the issue in a different way, it turned out that Nikita wasn’t entirely sure where the RAM access limitations really were. It ended up being a thorny question.

Arnd and Nikita pursued the problem together, each cursing loudly and loudly agreeing with each other over the horribleness of the API. The discussion ended with only an incomplete understanding of the problem, but at least the question had been identified. The issue of how best to allow PCI devices to access 64-bit DMA addresses remains open.

The kernel boot process is one of the scariest parts of the whole kernel. Trying to support every CPU ever made, including those with hardware errors, mis-features and various other design flaws, is quite simply insane. It should be no surprise that efforts to improve the boot process tend to be highly controversial.

Trying to support the multiboot specification, for example, turns out to have all kinds of pitfalls. Chao Peng tried to do this recently, and H. Peter Anvin offered strenuous objection. He said:

Multiboot has a fundamentally broken assumption, which is to do certain work for the kernel in the bootloader. This is fundamentally a bad idea, because you always want to do things in the latest step possible during the boot process, being the most upgradeable, and have the interface as narrow as possible. Therefore, using Multiboot is actively a negative step. It is declared an “Open Standard” but anything can be such declared; it really is a claim that “everything should work like Grub.”

The debate was not resolved during this email thread, but typically the boot specification would need to address the kernel folks’ objections before any code would be accepted.
—Zack Brown
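The PCI DMA thread summarized above turns on one constraint: a device can use only those addresses that every component between it and memory can also handle. The following is a minimal Python sketch of that idea, for illustration only. It is not kernel code, and the function names and the 32-bit bridge limit in the usage note are hypothetical.

```python
def dma_mask(bits: int) -> int:
    """Return a mask with the low `bits` bits set, e.g. 32 -> 0xFFFFFFFF."""
    if not 0 < bits <= 64:
        raise ValueError("bits must be between 1 and 64")
    return (1 << bits) - 1


def effective_dma_mask(device_bits: int, bridge_bits: int) -> int:
    """Clamp the width a device claims to what the bridge can pass through.

    Both parameters are hypothetical inputs for this sketch: the addressing
    width the device advertises and the width the host bridge supports.
    """
    return dma_mask(device_bits) & dma_mask(bridge_bits)


def is_reachable(address: int, mask: int) -> bool:
    """True if `address` has no bits set outside `mask`."""
    return address & ~mask == 0
```

With a device that advertises 64-bit addressing behind a bridge assumed to be limited to 32 bits, effective_dma_mask(64, 32) yields a 32-bit mask, so an address at or above 4GB fails is_reachable. The hard part in the actual discussion, as the column notes, was working out where the real limit sits.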

THE ESSENTIAL EVENT FOR THE COMMERCIAL DRONE INDUSTRY
JULY 19-21, 2017. OREGON CONVENTION CENTER, PORTLAND, OREGON.

REGISTER NOW! Use coupon code linuxjournal to save $50 off a full conference pass.

FEATURED SPEAKERS
Gretchen West, Hogan Lovells
Jonathan Evans, Skyward
Colin Snow, Skylogic Research
Sharon Rossmark, AeroVista Innovations

Discover cutting-edge commercial drone software and technology. Session topics include:
LiDAR mapping software
Advanced image processing
Thermal and multi-spectral imaging

Flying is just the beginning. ASCEND-EVENT.COM

UPFRONT

Spend Bitcoin Anywhere

I've written about Bitcoin several times during the past few years, and I still love the technology. I am a little disturbed by the amount of electricity the Bitcoin blockchain consumes using dirty power sources, but that's another discussion altogether. Although there are many places to spend Bitcoin directly, and services like Purse.io exist that allow you to spend Bitcoin at Amazon, what if you want to buy a pack of gum at the local gas station?

I recently ordered two different Bitcoin debit cards. One card is from BitPay (https://bitpay.com/card), and one is from Shift (https://www.shiftpayments.com/card). They both conceptually do the same thing, which is convert your Bitcoin into currency that can be spent anyplace that accepts debit cards. They work slightly differently in function though.

The BitPay card is a "reloadable" debit card that allows you to add US dollars to your card. When you load the card, Bitcoin is converted at the current price, and the dollar amount is stored in your account. Once the card is loaded, Bitcoin is out of the equation, and fluctuating prices don't matter. If you want to know exactly how much money you have on your card, the BitPay card is the way to go.

In contrast, the Shift card doesn't have any money loaded onto it. Rather, the Shift card connects to a Coinbase account, and at the time of purchase, your Bitcoin is converted to US dollars. This is actually "cleaner" than the BitPay method, but the volatility of Bitcoin can mean your actual available money isn't consistent. If Bitcoin tanks, so does your buying ability with the Shift card.

Each card had a one-time cost to buy, and neither has an ongoing fee to use. The transactions don't cost anything, and the only fees are when one of the cards is used at an ATM to get cash. Considering that you instantly can get cash from an ATM from Bitcoin, however, the small fee associated with the process isn't too difficult to accept.

If you've been avoiding digital currency because you don't have any way to spend it, I urge you to check out one or both of these cards. There are other options, but these seemed like the best deal, and I've personally used both. —Shawn Powers
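To make the difference between the two cards concrete, here is a small Python sketch. It only models when the Bitcoin-to-dollar conversion happens. The prices, the balance and the function names are all made up for illustration; this is not code from BitPay, Shift or Coinbase.

```python
# Illustrative model only: the prices, balances and function names below
# are assumptions invented for this sketch, not part of either card's service.

def prepaid_balance_usd(btc_loaded, price_at_load):
    """Reloadable (BitPay-style) card: convert once, when the card is loaded."""
    return btc_loaded * price_at_load


def linked_balance_usd(btc_held, price_now):
    """Linked (Shift-style) card: convert at the moment of purchase."""
    return btc_held * price_now


btc = 0.5                 # hypothetical amount of Bitcoin
price_at_load = 1200.0    # hypothetical USD price when the card was loaded
price_later = 900.0       # hypothetical USD price after a drop

# The prepaid balance was fixed at load time, so a later drop does not change it.
print(prepaid_balance_usd(btc, price_at_load))   # 600.0

# The linked card's buying power follows the current price.
print(linked_balance_usd(btc, price_later))      # 450.0
```

With the reloadable card, the dollar figure is known in advance; with the linked card, the same amount of Bitcoin buys less after the price falls.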

Gaming Like It's 1989

It's no secret that I love classic gaming. It seems like every other month, I write about an emulation project or some online version of a 1980s classic. The system that defined my youth was the Nintendo Entertainment System, or the NES. Its chunky rectangle controller and two-button setup may seem simple today, but back then, it was revolutionary. My hands still even form to the awkward controllers automatically like they did back in middle school.

Knowing that people like me exist, and that we're now old enough to buy things, Nintendo recently released its NES Classic Edition. Although they're still absurdly hard to find, I managed to buy one. And for anyone wondering whether the tiny replica is worth the price (or, okay, the price I paid), if the NES defined your youth, I would say yes.

I was worried the controller wouldn't feel like the original. I read a few reviews that said they felt too light or cheaper than the old ones. Well, I have both original controllers (for my emulation machine that I wrote about a few months back) and the controller that came with the NES Classic Edition, and I can say they both feel about the same. Also, although the gameplay isn't any different on the NES Classic Edition versus my emulation machine, I'm actually quite happy to pay for the "proper" device and give Nintendo money. I know ROMs are easy to find, but the only reason I download them illegitimately is that I can't buy them legally. Now, at least for 30 of the best games, I can.

I'm a hacker at heart, so although I urge you to buy the NES Classic if you're into that sort of gaming, I also want to play a few games that are not included. Thankfully, the NES Classic is super easy to hack. It's possible to hack the device to add ROMs manually, but there's also a great open-source tool called hakchi2 that will do all the heavy lifting for you. From what I can tell, it's a Windows-only program, but if you want a simple way to add a few ROMs, it's the way to go: https://github.com/ClusterM/hakchi2.

The hardest part? Finding an NES Classic Edition in stock. Good luck, fellow gamers! —Shawn Powers

Image Processing on Linux

I've looked at several scientific packages in this space that generate nice graphical representations of your data and work, but I've not gone in the other direction much. So in this article, I cover a popular image processing package called ImageJ. Specifically, I am looking at Fiji (https://imagej.net/Fiji), an instance of ImageJ bundled with a set of plugins that are useful for scientific image processing. The name Fiji is a recursive acronym, much like GNU. It stands for "Fiji Is Just ImageJ".

ImageJ is a useful tool for analyzing images in scientific research. For example, you may use it for classifying tree types in a landscape from aerial photography. ImageJ can do that type of categorization. It's built with a plugin architecture, and a very extensive collection of plugins is available to increase the available functionality.

The first step is to install ImageJ (or Fiji). Most distributions will have a package available for ImageJ. If you wish, you can install it that way and then install the individual plugins you need for your research. The other option is to install Fiji and get the most commonly used plugins at the same time. Unfortunately, most Linux distributions will not have a package available within their package repositories for Fiji. Luckily, however, an easy installation file is available from the main website. It's a simple zip file, containing a directory with all of the files required to run Fiji. When you first start it, you get only a small toolbar with a list of menu items (Figure 1).

Figure 1. You get a very minimal interface when you first start Fiji.

If you don't already have some images to use as you are learning to work with ImageJ, the Fiji installation includes several sample images. Click the File→Open Samples menu item for a dropdown list of sample images (Figure 2). These samples cover many of the potential tasks you might be interested in working on.

Figure 2. Several sample images are available that you can use as you learn how to work with ImageJ.

If you installed Fiji, rather than ImageJ alone, a large set of plugins already will be installed. The first one of note is the autoupdater plugin. This plugin checks the internet for updates to ImageJ, as well as the installed plugins, each time ImageJ is started.

All of the installed plugins are available under the Plugins menu item. Once you have installed a number of plugins, this list can become a bit unwieldy, so you may want to be judicious in your plugin selection. If you want to trigger the updates manually, click the Help→Update Fiji menu item to force the check and get a list of available updates (Figure 3).

Figure 3. You can force a manual check of what updates are available.

Now, what kind of work can you do with Fiji/ImageJ? One example is doing counts of objects within an image. You can load a sample by clicking File→Open Samples→Embryos.
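Counting objects in a thresholded image comes down to grouping touching foreground pixels into regions and then counting the regions. The following Python sketch shows that idea on a tiny hand-made grid. It is not how ImageJ implements its particle analysis: the function name, the sample grid and the choice of 4-connectivity (pixels touch only horizontally or vertically) are assumptions made for this example, and ImageJ's own rules may differ.

```python
from collections import deque


def count_particles(binary, min_size=1):
    """Return the pixel counts of 4-connected regions of 1s in a non-empty
    grid, keeping only regions with at least min_size pixels
    (hypothetical helper)."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill the region that starts here, counting its pixels.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_size:
                sizes.append(size)
    return sizes


# A made-up binary image: 1 is foreground, 0 is background.
image = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
]
print(count_particles(image))              # [4, 1, 3]
print(count_particles(image, min_size=2))  # [4, 3]
```

The min_size argument plays the same role as a minimum particle size: raising it drops single-pixel specks that are more likely noise than objects.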

The first step is to set a scale to the image so you can tell ImageJ how to identify objects. First, select the line button on the toolbar and draw a line over the length of the scale legend on the image. You then can select Analyze→Set Scale, and it will set the number of pixels that the scale legend occupies (Figure 5). You can set the known distance to the length shown on the scale legend and the units to be "um".

The next step is to simplify the information within the image. Click Image→Type→8-bit to reduce the information to an 8-bit gray-scale image. To isolate the individual objects, click Process→Binary→Make Binary to threshold the image automatically (Figure 6).

Before you can count the objects within the image, you need to remove artifacts like the scale legend. You can do that by using the rectangular selection tool to select it and then click Edit→Clear.

Figure 4. With ImageJ, you can count objects within an image.

Figure 5. For many image analysis tasks, you need to set a scale to the image.

Figure 6. There are tools to do automatic tasks like thresholding.

Now you can analyze the image and see what objects are there. Making sure that there are no areas selected in the image, click Analyze→Analyze Particles to pop up a window where you can select the minimum size, what results to display and what to show in the final image (Figure 7). Figure 8 shows an overall look at what was discovered in the summary results window. There is also a detailed results window for each individual particle.

Figure 7. You can generate a reduced image with identified particles.

Once you have an analysis worked out for a given image type, you often need to apply the exact same analysis to a series of images. This series may number into the thousands, so it's typically not something you will want to repeat manually for each image. In such cases, you can collect the required steps together into a macro so that they can be reapplied multiple times. Clicking Plugins→Macros→Record pops up a new window where all of your subsequent commands will be recorded. Once all of the steps are finished, you can save them as a macro file and rerun them on other images by clicking Plugins→Macros→Run.

If you have a very specific set of steps for your workflow, you simply can open the macro file and edit it by hand, as it is a simple text file. There is actually a complete macro language available to you to control the process that is being applied to your images more fully.

If you have a really large set of images that needs to be processed, however, this still might be too tedious for your workflow. In that case, go to Process→Batch→Macro to pop up a new window where you can set up your batch processing workflow (Figure 9). From this window, you can select which macro file to apply, the source directory where the input images are located and the output directory where you want the output images to be written. You also can set the output file format and filter the list of images being used as input based on what the filename contains. Once everything is done, start the batch run by clicking the Process button at the bottom of the window.

Figure 8. One of the output results includes a summary list of the particles identified.

Figure 9. You can run a macro on a batch of input image files with a single command.

If this is a workflow that will be repeated over time, you can save the batch process to a text file by clicking the Save button at the bottom of the window. You then can reload the same workflow by clicking the Open button, also at the bottom of the window. All of this functionality allows you to automate the most tedious parts of your research so you can focus on the actual science.

Considering that there are more than 500 plugins, and many macros as well, available from the main ImageJ website alone, it is an understatement that I've been able to touch on only the most basic of topics in this short article. Luckily, many domain-specific tutorials are available, along with the very good documentation for the core of ImageJ from the main project website. If you think this tool could be of use to your research, there is a wealth of information to guide you in your particular area of study. —Joey Bernard

THEY SAID IT

Everything that I understand, I understand only because I love. —Leo Tolstoy

The human brain is unique in that it is the only container of which it can be said that the more you put into it, the more it will hold. —Glenn Doman

Fear does not have any special power unless you empower it by submitting to it. —Les Brown

Train yourself to let go of the things you fear to lose. —George Lucas

Let us so live that when we come to die even the undertaker will be sorry. —Mark Twain


EDITORS' CHOICE

Non-Linux FOSS: How to Make Windows Better? Make It Chocolatey!

Once again, my friend and fellow Linux Journal club member Kris Occhipinti introduced me to an awesome bit of software. This time, it's an open-source project that brings Linux-like package management to Windows. Don't get me wrong; installing software on Windows isn't difficult, but it's definitely more cumbersome than with Linux. Plus, with Chocolatey (http://chocolatey.org), you can keep your installed packages up to date as easily as you can with Linux.

There is an open-source version of Chocolatey and paid versions. With the open-source version, you can install and maintain all the community packages, which for me is plenty. Literally thousands of software packages are available to install with a simple command-line entry. And unlike Cygwin (a wonderful program as well), Chocolatey installs the same Windows applications you'd install if you downloaded the installers and went through the process on your own.

Installation on Windows can be done via the command prompt (cmd.exe) or via PowerShell. If you open the command prompt as administrator (right-click, open as administrator, see screenshot), you can install with the following command, entered as a single line:

    @powershell -NoProfile -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"

Or even better, visit https://chocolatey.org/install for more options and a chance to look at the installation script before installing. The site actually recommends looking at the installation code before running it to make sure it's safe. That doesn't make me less confident of the code, but it makes me happy to see smart security choices.

So, thanks to making Windows a bit more like Linux and easing the process of keeping your software up to date, Chocolatey earns this month's Editors' Choice award. If you use Windows, head over to the website and check out this awesome system. It's especially useful for brand-new Windows installs, because managing all your third-party software with a single tool is wonderful. Thanks again, Kris! —Shawn Powers

AT THE FORGE

Novelty and Outlier Detection

Which of these data points doesn't belong? Machine learning can tell you.

REUVEN M. LERNER

Reuven M. Lerner, a longtime Web developer, offers training and consulting services in Python, Git, PostgreSQL and data science. He has written two programming ebooks (Practice Makes Python and Practice Makes Regexp) and publishes a free weekly newsletter for programmers, at http://lerner.co.il/newsletter. Reuven tweets at @reuvenmlerner and lives in Modi'in, Israel, with his wife and three children.

In my last few articles, I've looked at a number of ways machine learning can help make predictions. The basic idea is that you create a model using existing data and then ask that model to predict an outcome based on new data.

So, it's not surprising that one of the most amazing ways machine learning is being applied is in predicting the future. Just a few days before writing this piece, it was announced that machine learning models actually might be able to predict earthquakes, a goal that has eluded scientists for many years and that has the potential to save thousands, and maybe even millions, of lives.

But as you've also seen, machine learning can be used to "cluster" data: that is, to find patterns that humans either can't or won't see, and to try to put the data into various "clusters", or machine-driven categories. By asking the computer to divide data into distinct groups, you gain the opportunity to find and make use of previously undetected patterns.

Just as clustering can be used to divide data into a number of coherent groups, it also can be used to decide which data points belong inside a group and which don't. In "novelty detection", you have a data set that contains only good data, and you're trying to determine whether new observations fit within the existing data set. In "outlier detection", the data may contain outliers, which you want to identify.

Where could such detection be useful? Consider just a few questions you could answer with such a system:

■ Is there an unusual number of login attempts from a particular IP address?

■ Are any customers buying more than the typical number of products at a given hour?

■ Which homes are consuming above-average amounts of water during a drought?

■ Which judges convict an unusual number of defendants?

■ Should a patient's blood tests be considered normal, or are there outliers that require further checks and examinations?

In all of those cases, you could set thresholds for minimum and maximum values and then tell the computer to use those thresholds in determining what's suspicious. But machine learning changes that around, letting the computer figure out what is considered "normal" and then identify the anomalies, which humans then can investigate. This allows people to concentrate their energies on understanding whether the outliers are indeed problematic, rather than on identifying them in the first place.

So in this article, I look at a number of ways you can try to identify outliers using the tools and libraries that Python provides for working with data: NumPy, Pandas and scikit-learn. Just which technique and tools will be appropriate for your data depend on what you're doing, but the basic theory and practice presented here should at least provide you with some food for thought.

Finding Anomalies

Humans are excellent at finding patterns, and they're also quite good at finding things that don't fit a pattern. But, what sort of algorithm can look at a group of data sets and figure out which is unlike the others?

One simple way to do this is to set a cutoff, often done at one or two standard deviations. For those of you without a background in statistics (or who have forgotten what a "standard deviation" is), it's a measurement of how spread out the data is. For example:

>>> import numpy as np
>>> a = np.array([10,10,10,10,10,10,10])
>>> print("std = {}, mean = {}".format(a.std(), a.mean()))
std = 0.0, mean = 10.0

In the above example, I have a NumPy array containing seven instances of the number ten. People often think of the mean as describing the data, and it does, but it's only when combined with the standard deviation that you can know how much the numbers differ from one another. In this case, they're all identical, so the standard deviation is 0.

In this example, the mean remains the same, but the standard deviation is quite different:

>>> a = np.array([5,15,0,20,-5,25,10])
>>> print("std = {}, mean = {}".format(a.std(), a.mean()))
std = 10.0, mean = 10.0

Here, the mean has not changed, but the standard deviation has. You can see, from just those two numbers, that although the numbers remain centered around 10, they also are spread out quite a bit.

One simple way to detect unusual data is to look for all of the values that lie more than two standard deviations from the mean. For normally distributed data, about 95% of the values fall within that range. You can go further out if you want: about 99.7% of data points are within three standard deviations, and about 99.99% are within four. If you're looking for outliers in an existing data set, you can do something like this:

>>> a = np.array([-5,15,0,20,-5,25,1000])
>>> print(a.std())
347.19282415231044
>>> min_cutoff = a.mean() - a.std()*2
>>> max_cutoff = a.mean() + a.std()*2
>>> print(a[(a<min_cutoff) | (a>max_cutoff)])
[1000]

Sure enough, that found an outlier in the data. It's even easier if you have a bunch of new data and want to determine whether those values would fit inside or outside your existing data set:

>>> new_data = np.array([-5000, -3000, -1000, -500, 20, 60, 500, 800,
...                      900])
>>> print(new_data[(new_data<min_cutoff) | (new_data>max_cutoff)])
[-5000 -3000 -1000   900]

The good news is that this is simple: simple to understand, simple to implement and simple to automate.

However, it's also too simple for most data. You're unlikely to be looking at a single-dimensional vector. The baseline (mean) is likely to shift over time. And besides, there must be other, better ways to measure whether something is "inside" or "outside", right?
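The cutoff approach above can be wrapped in a small reusable function. The sketch below deliberately uses only Python's standard-library statistics module instead of NumPy, so it runs without extra packages; pstdev is the population standard deviation, which is also what NumPy's a.std() computes by default. The function name sigma_outliers and its n_sigma parameter are made up for this example.

```python
from statistics import mean, pstdev


def sigma_outliers(baseline, candidates, n_sigma=2):
    """Return the candidates lying more than n_sigma standard deviations
    from the mean of baseline (hypothetical helper)."""
    center = mean(baseline)
    spread = pstdev(baseline)   # population std, like NumPy's default a.std()
    low = center - n_sigma * spread
    high = center + n_sigma * spread
    return [x for x in candidates if x < low or x > high]


# The same numbers used in the article's NumPy session.
baseline = [-5, 15, 0, 20, -5, 25, 1000]
new_data = [-5000, -3000, -1000, -500, 20, 60, 500, 800, 900]

print(sigma_outliers(baseline, baseline))  # [1000]
print(sigma_outliers(baseline, new_data))  # [-5000, -3000, -1000, 900]
```

Note that the single extreme value 1000 inflates the standard deviation to roughly 347, which is why -500 and 800 are still treated as "inside". That sensitivity is one more reason the simple cutoff is often too crude.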

Getting More Sophisticated

For real-world anomaly detection, you're going to need to improve on a few fronts. You'll need to consider the data and determine what's "in" and what's "out". You'll also need to figure out ways to evaluate your model.

Let's consider novelty detection: there is initial data, and you want to know if a new piece of data would fit inside the existing data or if it would be considered an outlier. For example, consider a patient who comes in with values from a blood test. Do those tests indicate that the patient is normal, because the data's values are similar to the ones you've already seen? Or are those new values statistical outliers, indicating that the patient needs additional attention?

In order to experiment with novelty and outlier detection, I downloaded historic precipitation data for an area of Pennsylvania (Wyncote, just outside Philadelphia), for every day in a single year. Because I'm a scientific kind of guy, I downloaded the data in metric units. The data came from the US government, at https://www.climate.gov/maps-data/dataset/past-weather-zip-code-data-table. That site contains clear instructions for downloading data from here: https://www.ncdc.noaa.gov/cdo-web/datasets. It's quite amazing what government data is freely available, and the sorts of analysis you can do with it once you've retrieved it.

I downloaded the data as a CSV file and then used Pandas to read it into a data frame:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.read_csv('/Users/reuven/downloads/914914.csv',
...                  usecols=['PRCP', 'DATE'])

Notice that I was interested only in PRCP (precipitation) and DATE (the date, in YYYYMMDD format). I then manipulated the data to break apart the DATE column and then to remove it:

>>> df['DATE'] = df['DATE'].astype(str)
>>> df['MONTH'] = df['DATE'].str[4:6].astype(np.int8)
>>> df['DAY'] = df['DATE'].str[6:8].astype(np.int8)
>>> df.drop('DATE', inplace=True, axis=1)
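The slicing used on the DATE column is easier to see on a single value. The following plain-Python sketch applies the same positions (characters 4-5 for the month, 6-7 for the day) to one YYYYMMDD value at a time. The function name and the sample dates are made up for illustration and are not taken from the downloaded data.

```python
def split_yyyymmdd(date_value):
    """Split a YYYYMMDD value (int or str) into (month, day) integers.

    Hypothetical helper that mirrors the str[4:6] and str[6:8] slices
    applied to the DATE column.
    """
    text = str(date_value)
    if len(text) != 8 or not text.isdigit():
        raise ValueError("expected YYYYMMDD, got {!r}".format(date_value))
    return int(text[4:6]), int(text[6:8])


# Made-up sample dates, given once as an integer and once as a string.
print(split_yyyymmdd(20170716))    # (7, 16)
print(split_yyyymmdd("20171203"))  # (12, 3)
```

Converting to a string first matters because a CSV reader may load the column as integers, and integers cannot be sliced by position.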

Why would I break the date apart? Because it'll likely be easier for models to work with separate numeric columns, rather than a single date-time column. Besides, having these columns as part of my model will make it easier to understand whether snow in July is abnormal. I ignore the year, since it's the same for every record, which means that it can't help me as a predictor in this model.

My data frame now contains 353 rows of data from that year (I'm not sure why there isn't one row for every day), with columns indicating the amount of rain (in mm), the day of the month and the month.

Based on this, how can you build a model to indicate whether rainfall on a given day is normal or an outlier?

In scikit-learn, you always use the same method: you import the estimator class, create an instance of that class and then fit the model. In the case of supervised learning, "fitting" means teaching the model which inputs go with which outputs. In the case of unsupervised learning, which I'm doing here, you use "fit" with just a set of inputs, allowing the model to distinguish between inliers and outliers.

Creating a Model

In the case of this data, there are several types of models that I can build. I experimented a bit and found that the IsolationForest estimator gave me the best results. Here's how I create and train the model:

>>> from sklearn.ensemble import IsolationForest
>>> model = IsolationForest()
>>> model.fit(df)

The model now has been trained, so I can find out whether a given amount of rain, on a certain month and day, is considered normal. To try things out, I check the model against its own inputs:

>>> from pandas import Series
>>> Series(model.predict(df)).value_counts()

In the above code, I run model.predict(df). This gives the inputs to the model and asks it to predict whether these are normal, expected values (indicated by 1) or outlier values (indicated by -1). By turning the result into a Pandas series and then calling value_counts, I see:

 1    317
-1     36

Although it falsely marked 36 days as outliers, maybe those days were unusual. The model certainly would be improved if it had multiple years' worth of data, rather than just one year's worth.

Now what? I can ask the system to make some predictions:

for i in range(1, 13):
    print(model.predict([[15, i, 16]]))

This will tell whether it's normal to get 15 mm of rain on the 16th of each month. The conclusion of the model: yes, it's perfectly normal in February through July, but not so in August through January. What about if there's zero precipitation?

for i in range(1, 13):
    print(model.predict([[0, i, 16]]))

It turns out that no matter what month, it's never an outlier to have zero rain on the 16th of the month.

Of course, those are just crude tests. The real thing to do is use our old friend train_test_split:

>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test = train_test_split(df)
>>> model.fit(X_train)
>>> Series(model.predict(X_test)).value_counts()

The model did pretty well, given that I didn't even try to tune it:

 1    77
-1    12
dtype: int64
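To put the two value_counts results side by side, it can help to turn them into a share of flagged rows. The sketch below does that in plain Python, rebuilding lists of 1 and -1 labels from the counts reported above rather than rerunning the model; the helper name outlier_share is made up for this example.

```python
from collections import Counter


def outlier_share(predictions):
    """Fraction of predictions labeled -1 (outlier) rather than 1 (inlier).

    Hypothetical helper; expects the 1 / -1 labels that predict() returns.
    """
    counts = Counter(predictions)
    total = counts[1] + counts[-1]
    if total == 0:
        raise ValueError("no predictions to summarize")
    return counts[-1] / total


# Label lists rebuilt from the counts shown in the article's output.
full_run = [1] * 317 + [-1] * 36   # model checked against its own inputs
held_out = [1] * 77 + [-1] * 12    # model checked against the test split

print(round(outlier_share(full_run), 3))   # 0.102
print(round(outlier_share(held_out), 3))   # 0.135
```

Roughly one row in ten is flagged in the first run and a bit more in the held-out split. Bear in mind that IsolationForest has a contamination parameter that influences how many points get flagged, so part of this share reflects a setting rather than the weather.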

In other words, given data that should all be classified as inliers, you can see here that the overwhelming majority is indeed classified correctly.

There are other types of estimators you can use as well. In particular, the One-Class SVM estimator has had a good track record of working with input data. That, combined with a larger data set, might well improve the results shown above, although in trying One-Class SVM for this article, I didn't see any such results. It's possible that if I were to add several more years' worth of data, other estimators would work better.

Conclusion

Novelty and outlier detection is yet another large, exciting and growing use for machine learning. As usual with machine learning, the problem is not one of coding, but rather of massaging the data into a format that you can use
