/*
Objective:
Since Windows 10 File Explorer search seems messed-up on my laptop
(and computers of at least some others reporting online) since late 2019,
try to make something for finding files/folders on my laptop while waiting for a fix!

v7_i2 (version 7, intermediate/iteration two) 9Feb2020
--See description for i1.
  Otherwise, some rearrangements made, but I am resigned to not being able to toggle text on the Find button
  without having it disappear and other GUI glitches ...non-thread-safe nature of Swing?
  Save this code and plan to make new version trying to allow program/GUI exit via a new button instead

Items that might be added/addressed later: see comments at bottom of file
*/

import java.awt.BorderLayout;
import java.awt.Color;
import java.awt.Font;
import java.awt.GridLayout;
import java.awt.Insets;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;
import javax.swing.BorderFactory;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;
import javax.swing.border.EmptyBorder;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

class FindFileOrFolder_v7_i2
{
    String missingStartMessage = "";
    String missingTargetMessage = "";

    int walkDepth = 1; // --specifies how many subdirectory levels to go down when collecting files to access
                       //   initialized to default 1 (do not look in subdirectories)
    boolean caseSensitive; // whether search term is treated as case-sensitive (default is no)
    boolean lookInside; // whether to search for target inside txt and csv files also

    boolean doesUserWantToExit = true; // --v7 added 'to enable kill switch'...viable?----------
    // Problem: with the addition of the Apache POI-based within doc/docx text searching,
    // searches sometimes take many minutes, and CANNOT be terminated by clicking the GUI window's 'close' button!
    // This may be due to use of invokeLater method?
    // (invokeLater runs the search on Swing's event dispatch thread, which also handles repaints
    // and window-close events, so these wait until the search returns.)

    // Declaring fields for GUI components...
    JLabel startFolderLabel = new JLabel("Enter the path of the folder from within which you want to (start your) search...");
    JTextArea startFolderTextArea = new JTextArea(2, 60); // input to specify folder within which to (start) searching
    JButton depthButton = new JButton("Search also in subfolders of the starting folder");
    JButton withinFileButton = new JButton("Search also text within following file types: docx, doc, txt, csv"); // --v7 added "docx, doc"
    JPanel howDeepPanel = new JPanel(new GridLayout()); // to hold the depthButton and withinFileButton
    JPanel wherePanel = new JPanel(new BorderLayout()); // to hold startFolderLabel, startFolderTextArea & howDeepPanel

    JLabel targetNameLabel = new JLabel("Enter the name or partial name of a file or folder you want to find...");
    JTextArea targetNameTextArea = new JTextArea(2, 60); // input to specify file/folder names for which to search
    JButton caseButton = new JButton("Make search case-sensitive");
    JPanel whatPanel = new JPanel(new BorderLayout()); // targetNameLabel, targetNameTextArea & caseButton

    JPanel inputsPanel = new JPanel(new BorderLayout()); // to hold wherePanel & whatPanel

    JButton findButton = new JButton("Find files/folders");
    JTextArea displayTextArea = new JTextArea(40, 100); // displays output, i.e. paths for files/folders found
    JPanel findAndShowPanel = new JPanel(new BorderLayout()); // to hold findButton & displayTextArea

    JFrame frame = new JFrame("FindFileOrFolder"); // to hold all above (sub)panels

    FindFileOrFolder_v7_i2() // constructor, called when main method runs
    {
        startFolderLabel.setBorder(new EmptyBorder(10, 10, 0, 0));
        startFolderTextArea.setLineWrap(true);
        startFolderTextArea.setMargin(new Insets(5, 10, 5, 10));
        JScrollPane startSP = new JScrollPane(startFolderTextArea);
        startSP.setBorder(BorderFactory.createMatteBorder(2, 0, 0, 0, new Color(245, 245, 245)));

        depthButton.addActionListener(actionEvent ->
        {
            if (walkDepth == 1)
            {
                walkDepth = Integer.MAX_VALUE;
                depthButton.setText("Search in the starting folder only");
            }
            else // (walkDepth is Integer.MAX_VALUE)
            {
                walkDepth = 1;
                depthButton.setText("Search in subfolders of the starting folder also");
            }
        }); // toggles walk depth between no-subfolders (starting state) and all-subfolders
        howDeepPanel.add(depthButton, BorderLayout.WEST);

        withinFileButton.addActionListener(actionEvent ->
        {
            if (lookInside == false) // (could rewrite as !lookInside)
            {
                lookInside = true;
                withinFileButton.setText("Revert to not searching also text within files (docx, doc, txt, csv)"); // --v7 text updated
            }
            else // (lookInside is true)
            {
                lookInside = false;
                withinFileButton.setText("Revert to searching also text within following file types: docx, doc, txt, csv"); // --v7 text updated
            }
        }); // toggles between searching just file names and text within any txt/csv files also
        howDeepPanel.add(withinFileButton, BorderLayout.EAST);

        wherePanel.setBackground(new Color(245, 245, 245));
        wherePanel.add(startFolderLabel, BorderLayout.NORTH);
        wherePanel.add(startSP, BorderLayout.CENTER);
        wherePanel.add(howDeepPanel, BorderLayout.SOUTH);

        targetNameLabel.setBorder(new EmptyBorder(10, 10, 0, 0));
        targetNameTextArea.setLineWrap(true);
        targetNameTextArea.setMargin(new Insets(5, 10, 5, 10));
        JScrollPane targetNameSP = new JScrollPane(targetNameTextArea);
        targetNameSP.setBorder(BorderFactory.createMatteBorder(2, 0, 0, 0, new Color(245, 245, 245)));

        caseButton.addActionListener(actionEvent ->
        {
            if (caseSensitive == false)
            {
                caseSensitive = true;
                caseButton.setText("Make search case-insensitive again");
            }
            else // (caseSensitive is true)
            {
                caseSensitive = false;
                caseButton.setText("Make search case-sensitive again");
            }
        }); // toggles search between case-insensitive (starting state) and case-sensitive

        whatPanel.setBackground(new Color(245, 245, 245));
        whatPanel.add(targetNameLabel, BorderLayout.NORTH);
        whatPanel.add(targetNameSP, BorderLayout.CENTER);
        whatPanel.add(caseButton, BorderLayout.SOUTH);

        inputsPanel.add(wherePanel, BorderLayout.NORTH);
        inputsPanel.add(whatPanel, BorderLayout.SOUTH);

        findButton.addActionListener(actionEvent ->
        { // (the 'actionPerformed' method body of the implicit ActionListener...)
            String startText = startFolderTextArea.getText().trim();
            Path startPath = null; // path to folder within which to (start) searching
            boolean startPathValid = true;

            try
            {
                startPath = Paths.get(startText); // ...for some start inputs on this call
            }
            catch (Exception e) // to handle possible InvalidPathException
            {
                startPathValid = false;
            }

            Path validatedStartPath = startPath; // (need an effectively final variable for use later in a lambda)

            if (startPathValid) // even if real path input, want to check that it is a folder (cf a file), so reassign to...
            {
                startPathValid = Files.isDirectory(startPath);
                // (as empty string arg seems to generate Path regarded
                // as valid (root/current folder?), so for now including check re startText.isEmpty() below)
            }

            String targetText = targetNameTextArea.getText(); // (partial) names of files/folders for which to search
            final String target = caseSensitive ? targetText : targetText.toLowerCase();

            if (startText.isEmpty() || !startPathValid)
            {
                missingStartMessage = "No valid start path supplied" + "\n";
            }
            if (target.isEmpty())
            {
                missingTargetMessage = "No search term supplied" + "\n";
            }

            if (!startText.isEmpty() && startPathValid && !target.isEmpty())
            {
                // Only run process below if target text and valid start path have been supplied (avoid wasteful processing)
                // First, clear any previous message or search results and display a message to say the search is in progress
                displayTextArea.setForeground(Color.GREEN);
                displayTextArea.setFont(new Font("SERIF", Font.BOLD, 20));
                displayTextArea.setText("Search in progress...");

                // --v7 added for 'kill switch'...viable?----------
//                findButton.setText("Click again if you wish to exit (e.g. if search taking too long)");
                // --problem: causes Find button to disappear until search completed! Why?
                // --and worse, most of the time the other buttons and boxes then disappear, but reappear if click around etc!
                // (no point putting in invokeLater below as latter's effects all appear only when all finished)
                // experimenting: changing to call from caseButton, caseButton text changes, but findButton still disappears!
//                if (doesUserWantToExit == true)
//                {
//                    doesUserWantToExit = false; // on first click, allow search to proceed
//                }
//                else // doesUserWantToExit is false, so must be second click...
//                {
//                    doesUserWantToExit = true; // ...and this value will allow System.exit(1)
//                                               // to be executed in the find(...) method below
//                }

                SwingUtilities.invokeLater(() -> // (want the above 'progress' message to appear first if search takes a while)
                {
                    displayTextArea.setForeground(Color.BLACK);
                    displayTextArea.setFont(null); // this seems to work to restore the font to default (as desired)
                    displayTextArea.setText(null); // clear any previous text before displaying the results

                    try
                    {
//                        long startSesrch = System.currentTimeMillis(); // --temporary
                        // ...for testing - to start crude measurement of time to execute search

                        // original approach...
                        Stream<Path> targetStream = Files.find(validatedStartPath, walkDepth, (p,a) ->
                        {
//                            if (doesUserWantToExit) System.exit(1); // --v7 added for kill switch...viable?----------

                            return // return the boolean from...
                                !isHiddenHandler(p) // to exclude hidden (temporary etc) files
                                && ( toStringHandled(p.getFileName()).contains(target) // target in file/folder name...
                                     || (lookInside ? fileContainsTarget(p, target) : false) // or target within file text
                                   );
                        });
                        targetStream.forEach(p -> displayTextArea.append(p + "\n")); // print paths found in output/display area

                        // 2nd approach...
//                        Stream<Path> targetStream = Files.walk(validatedStartPath, walkDepth);
//                        targetStream = targetStream.filter(p ->
//                        {
////                            if (doesUserWantToExit) System.exit(1); // --v7 added for kill switch...viable?----------
//
//                            return // return the boolean from...
//                                !isHiddenHandler(p) // to exclude hidden (temporary etc) files
//                                && ( toStringHandled(p.getFileName()).contains(target) // target in file/folder name...
//                                     || (lookInside ? fileContainsTarget(p, target) : false) // or target within file text
//                                   );
//                        });
//                        targetStream.forEach(p -> displayTextArea.append(p + "\n")); // print paths found in output/display area

                        // 3rd approach...
//                        Stream<Path> targetStream = Files.walk(validatedStartPath, walkDepth);
//                        targetStream.filter(p ->
//                        {
////                            if (doesUserWantToExit) System.exit(1); // --v7 added for kill switch...viable?----------
//
//                            boolean hasTarget =
//                                !isHiddenHandler(p) // to exclude hidden (temporary etc) files
//                                && ( toStringHandled(p.getFileName()).contains(target) // target in file/folder name...
//                                     || (lookInside ? fileContainsTarget(p, target) : false) // or target within file text
//                                   );
//                            if (hasTarget) displayTextArea.append(p + "\n"); // print paths found in output/display area
//                            return hasTarget;
//                        }).forEach(p -> {}); // (does nothing, but need to call a terminal method to execute stream)

                        // 4th approach - don't need a selected/filtered stream...just make decision whether to print path in forEach...
//                        Stream<Path> targetStream = Files.walk(validatedStartPath, walkDepth);
//                        targetStream.forEach(p ->
//                        {
////                            if (doesUserWantToExit) System.exit(1); // --v7 added for kill switch...viable?----------
//
//                            boolean hasTarget =
//                                !isHiddenHandler(p) // to exclude hidden (temporary etc) files
//                                && ( toStringHandled(p.getFileName()).contains(target) // target in file/folder name...
//                                     || (lookInside ? fileContainsTarget(p, target) : false) // or target within file text
//                                   );
//                            if (hasTarget) displayTextArea.append(p + "\n"); // print paths found in output/display area
//                        });

                        // 5th approach...going back to Files.find(), but printing in the BiPredicate method
//                        Stream<Path> targetStream = Files.find(validatedStartPath, walkDepth, (p,a) ->
//                        {
////                            if (doesUserWantToExit) System.exit(1); // --v7 added for kill switch...viable?----------
//
//                            boolean hasTarget =
//                                !isHiddenHandler(p) // to exclude hidden (temporary etc) files
//                                && ( toStringHandled(p.getFileName()).contains(target) // target in file/folder name...
//                                     || (lookInside ? fileContainsTarget(p, target) : false) // or target within file text
//                                   );
//                            if (hasTarget) displayTextArea.append(p + "\n"); // print paths found in output/display area
//                            return hasTarget;
//                        });
//                        targetStream.forEach(p -> {});

                        // (Note: Unfortunately, none of the approaches allow printing DURING the search
                        // as invokeLater has everything happen when it finishes)

                        // --temporary, for testing...
//                        System.out.println("Search time in milliseconds approx': ");
//                        System.out.println(System.currentTimeMillis() - startSesrch);
                        // Without doing stats, observed no big difference in speed between approaches on my system for this test search
                        // Saw first execution (after opening program) take longer (e.g. 2-4x) than subsequent
                        // and sometimes VERY long (e.g. 3 mins; as have observed before; reason for trying to make kill switch)
                        // perhaps corresponding to first execution of a search with new target (/starting folder?)?

                        if (displayTextArea.getText().equals(""))
                        {
                            displayTextArea.setForeground(Color.BLUE);
                            displayTextArea.setText("No results found");
                        } // (seems inelegant, but have not thought of a way to do directly from the stream code yet)
                    }
                    catch (IOException ex)
                    {
                        if (ex instanceof java.nio.file.NoSuchFileException)
                        {
                            missingInputsMessage();
                        } // (probably not needed now, though, as should not get to try clause without valid start path)
                    }
//
//                    // --v7 added for the kill switch...is it viable?----------
//                    doesUserWantToExit = true; // reset for next search
//                    findButton.setText("Find files/folders [...temporary change to text for troubleshooting code]"); // reset for next search
                });
            }
            else
            {
                missingInputsMessage();
            }
        });

        findButton.setMargin(new Insets(10, 10, 10, 10));
        findButton.setFont(new Font("SansSerif", Font.BOLD, 20));
        findAndShowPanel.add(findButton, BorderLayout.NORTH);

        displayTextArea.setEditable(false);
        displayTextArea.setLineWrap(true);
        displayTextArea.setWrapStyleWord(true);
        displayTextArea.setMargin(new Insets(5, 5, 5, 5));
        findAndShowPanel.add(new JScrollPane(displayTextArea), BorderLayout.SOUTH);

        frame.add(inputsPanel, BorderLayout.NORTH);
        frame.add(findAndShowPanel, BorderLayout.SOUTH);
        frame.setResizable(false); // (buttons disappear if user drags frame bottom up,
        // while if it's dragged down, the extra space just appears as a gap between the panels,
        // so just hard-coding big displayTextArea for the moment;
        // looking briefly online, see descriptions/code for how to make resizable by dragging, but not trivial
        frame.setVisible(true);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.pack(); // (may need to keep this positioned last)
    }

    void missingInputsMessage() // puts message in display area if one or both user inputs missing/incorrect
    {
        displayTextArea.setForeground(Color.red);
        displayTextArea.setText(missingStartMessage + missingTargetMessage);
        missingStartMessage = ""; // reset for future clicks
        missingTargetMessage = ""; // (ditto)
    }

    boolean isHiddenHandler (Path p) // as Files.isHidden(...) throws checked exception...
    { // ...easier to handle it externally than in stream coded in findButton's addActionListener above
        boolean result = false;
        try
        {
            result = Files.isHidden(p);
        }
        catch (IOException ex)
        {
            System.out.println(ex);
        }
        return result;
    }

    String toStringHandled (Path p) // as Path's toString() can throw NullPointerException...
    { // ...easier to handle it externally than in stream coded in findButton's addActionListener above
        String result = "";
        try
        {
            result = p.toString(); // p is filename for start path, and is null if that is root, e.g. C:\
        }
        catch (NullPointerException npEx) // ...in which case NullPointerException is thrown
        {
            displayTextArea.setText("Note: As you are starting from the root, "
                + "the search may terminate early if subfolders without access permission are encountered. "
                + "(This is a known issue, and may also affect searches starting further down the hierarchy, "
                + "in which case you will not see feedback, unfortunately)."
                + " May be addressed in a subsequent version of the program."
                + "\n\n");
        }

        if (!caseSensitive) // if user has chosen case-sensitive option, this does not happen...
        {
            result = result.toLowerCase(); // ...file/folder name made all-lowercase, to match the lower-cased search term
        }
        return result;
    }

    // --v7 altered to delegate check to new method txtORcsvHasTrget(...) if file is txt or csv...
    // (also removed un-needed targetFound variable)
    boolean fileContainsTarget (Path p, String target)
    {
        Path name = p.getFileName();
        if (name == null) // a root such as C:\ has no file name, so there is nothing to open
        {
            return false;
        }
        String fileName = name.toString().toLowerCase();

        if (Files.isReadable(p)) // (have not found I actually need this isReadable check, but testing has been limited)
        {
            if ( ( fileName.endsWith(".txt") || // defining file types in which to search...
                   fileName.endsWith(".csv") // ...and could add any other 'UTF text' types if there are any
                 ) )
            {
                return txtORcsvHasTrget(p, target);
            }
            else if (fileName.endsWith(".docx"))
            {
                return docxHasTarget(p, target);
            }
            else if (fileName.endsWith(".doc"))
            {
                return docHasTarget(p, target);
            }
        } // (or could parse the file extension as a string and use a switch statement instead)
        return false;
    }

    // --v7 added (mod from old fileContainsTarget(...) method)...
    boolean txtORcsvHasTrget (Path p, String target)
    {
        // try-with-resources closes the file handle held by Files.lines(...) once the check is done
        try (Stream<String> linesFromFile = Files.lines(p, StandardCharsets.ISO_8859_1))
        {
            // Note for future reference: adding second arg StandardCharsets.ISO_8859_1 may avoid
            // MalformedInputException being thrown for non-UTF-encoded files,
            // e.g. docx, xlsx, pdf, which I am not including here as only 'gibberish' symbols are displayed of course
            Stream<String> lines = linesFromFile;
            if (!caseSensitive)
            {
                lines = lines.map(s -> s.toLowerCase()); // (equivalently, String::toLowerCase)
            }
            // General note(s): As anyMatch(...) will return as soon as a match is found (if any)
            // it will not waste resources/time processing subsequent file lines.
            // Order in which lines are processed not important,
            // so might investigate later if making stream parallel improves speed
            return lines.anyMatch(s -> s.contains(target));
        }
        catch (IOException ex)
        {
            System.out.println(ex);
        }
        return false;
    }

    // --v7 added...
    boolean docxHasTarget(Path p, String target)
    {
        // try-with-resources closes the input stream even if Apache POI rejects the file
        try (FileInputStream fIS = new FileInputStream(p.toFile())) // (Is there a more modern API I could use here with Apache POI?)
        {
            XWPFDocument docx = new XWPFDocument(fIS);
            List<XWPFParagraph> paragraphList = docx.getParagraphs();
            // (Would be nice to generate a stream rather than read everything into memory before processing,
            // but there does not seem to be a method for that in current Apache POI?)
            // May be a way to do for Excel files when I address them later? http://poi.apache.org/components/spreadsheet/limitations.html
            Stream<String> paragraphStringStream = paragraphList.stream().map(para -> para.getText());
            if (!caseSensitive)
            {
                paragraphStringStream = paragraphStringStream.map(s -> s.toLowerCase()); // (equivalently, String::toLowerCase)
            }
            return paragraphStringStream.anyMatch(s -> s.contains(target));
        }
        catch (IOException ex)
        {
            System.out.println(ex);
        }
        catch (org.apache.poi.openxml4j.exceptions.NotOfficeXmlFileException ex) // See Note2 at bottom of file
        {
//            System.out.println("org.apache.poi.openxml4j.exceptions.NotOfficeXmlFileException from docxHasTarget(...) method, "
//                + "processing file " + p + " ...ignore and continue.");
            // Keeping disabled unless needed for troubleshooting as these prints are visible to end-user in GUI
            // and can slow the search if many 'problem' .docx files are encountered
        }
        return false;
    }

    // --v7 added...
    boolean docHasTarget(Path p, String target)
    {
        // try-with-resources closes the input stream even if Apache POI rejects the file
        try (FileInputStream fIS = new FileInputStream(p.toFile()))
        {
            HWPFDocument doc = new HWPFDocument(fIS);
            WordExtractor extractor = new WordExtractor(doc);
            String[] paragraphStrings = extractor.getParagraphText(); // (getText() would extract all text as single String)
            Stream<String> paragraphStringStream = Arrays.stream(paragraphStrings);
            if (!caseSensitive)
            {
                paragraphStringStream = paragraphStringStream.map(s -> s.toLowerCase()); // (equivalently, String::toLowerCase)
            }
            return paragraphStringStream.anyMatch(s -> s.contains(target));
        }
        catch (IOException ex)
        {
            System.out.println(ex);
        }
        catch (IllegalArgumentException ex) // See Note1 at bottom of file
        {
//            System.out.println("IllegalArgumentException from docHasTarget(...) method, "
//                + "processing file " + p + ", probably because"
//                + " a non-doc file with a mis-applied doc extension was encountered...ignore and continue.");
            // Keeping disabled unless needed for troubleshooting as these prints are visible to end-user in GUI
            // and can slow the search if many 'problem' .doc files are encountered
        }
        return false;
    }

    public static void main(String[] args)
    {
        new FindFileOrFolder_v7_i2();
    }
}

/*
Items that might be added/addressed later:
--Possible to allow a search to be aborted (while displaying items found up to that point) without closing
  the program window (& have the button text toggle to a message indicating this option)?
  Maybe investigate use of SwingWorker.
  Known issue: It is not even possible to abort the program by clicking to close the GUI window,
  so might be worth trying a button or toggle to exit the program as a lower-tech work-around.
--Maybe try to add an option to include text from within at least some text-encoding file types in search.
--Maybe try to address known issue that searches with search-in-subfolders enabled from some start paths near the root,
  e.g. C:\Users, and even C:\Users\[my user name]\Documents on my system, terminate prematurely.
  (However, though the user does not receive feedback and may not realise that not all files which should be found are,
  the program does not crash and will respond normally to a 'regular' subsequent search.)
  Cause = AccessDeniedException thrown due to denial of access to some folders.
  Might not be able to handle this from Files.find(...), as used currently, or Files.walk(...),
  so perhaps try to instead use walkFileTree + FileVisitor.
--User settings provided by the upper buttons (all but the Find button) could be provided by dropdowns
  or radio buttons or checkboxes (+see * below) instead, to make current settings more immediately obvious.
--Maybe address any notes left in comments re possible reconfigurations
--Test whether making any of the Streams parallel (where possible, i.e. without loss of any beneficial output order etc)
  improves speed (on my system, at least); especially relevant if looking within files
--Maybe try to address known issue that searches with search-within-files enabled may be slowed
  if 'inauthentic' files having .doc(/.docx) extensions are encountered. See comments in Note1 and Note2 below.
  Way to determine actual file type rather than just parsing extension before sending to Apache POI code?
  Perhaps use walkFileTree + FileVisitor instead of Files.find(...) to skip _vti_cnf folders?
 *Check boxes might be especially useful to allow the user to choose search-within for only some of the supported file types
--Known issue: Sometimes a given search takes ~10x or more longer than it does if executed subsequently
  once or repeatedly (on my system). Also, the GUI may take a long time to open sometimes.
  Try to find out why and address.
*/

/*
Other notes:
--Note1: Searching in one of my large folders (note to self - “Archives”) with search-in-subfolders & within-files enabled,
  got “java.lang.IllegalArgumentException: The document is really a UNKNOWN file”
  ...coming from HWPFDocument via invocation of my docHasTarget(...) method.
  Looking online, this may be applicable:
  https://stackoverflow.com/questions/4996954/error-in-displaying-a-doc-file-reading-that-from-a-document-on-console-in-java
  comments “That's a typical message of an IllegalArgumentException from the HWPFDocument constructor.
  To the point it means that the supplied file is actually a (Wordpad) RTF file
  whose .rtf extension has incorrectly been renamed to .doc.”
  On handling the exception, I note that many of my 'problem' files have the .docx extension and open with MS Word
  but are within _vti_cnf folders that I inadvertently made in my archive folders years ago,
  and/or were transferred from an Apple Mac years ago, sometimes involving RESOURCE.FRK entity.
  Are some files wrongly flagged as not .doc? How much are things slowed up?
--Note2: I assume the org.apache.poi.openxml4j.exceptions.NotOfficeXmlFileException that I see thrown
  (albeit much less frequently from the files in the folder mentioned in Note1) from docxHasTarget(...)
  is somewhat analogous to that of Note1. Also from _vti_cnf folders, as per Note1 above.
*/
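
/*
Sketch for the "walkFileTree + FileVisitor" and "abort a search" items in the notes above:
This is only a minimal illustration, not part of v7_i2 and not wired into the GUI. The class name
TolerantFinderSketch, its find(...) method and the 'cancelled' flag are all hypothetical names made up for
this note. It assumes that returning CONTINUE from visitFileFailed is enough to get past folders that
throw AccessDeniedException, and that checking a flag on every visit is an acceptable way to stop early.
It matches on file/folder names only, always case-insensitively, and does not look inside files.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper class (an assumption for illustration; it does not exist in v7_i2).
class TolerantFinderSketch
{
    // Returns the paths under 'start' (down to 'maxDepth' levels) whose name contains 'target',
    // ignoring case. Unreadable entries are skipped; the walk stops once 'cancelled' is true.
    static List<Path> find(Path start, int maxDepth, String target, AtomicBoolean cancelled) throws IOException
    {
        List<Path> found = new ArrayList<>();
        String wanted = target.toLowerCase(Locale.ROOT);

        Files.walkFileTree(start, EnumSet.noneOf(FileVisitOption.class), maxDepth, new SimpleFileVisitor<Path>()
        {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
            {
                return check(dir); // folders above the depth limit arrive here
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
            {
                return check(file); // files, and folders sitting exactly at the depth limit
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException ex)
            {
                // Reached when an entry cannot be read (e.g. AccessDeniedException):
                // carry on with the rest of the tree rather than ending the whole search.
                return cancelled.get() ? FileVisitResult.TERMINATE : FileVisitResult.CONTINUE;
            }

            private FileVisitResult check(Path p)
            {
                if (cancelled.get())
                {
                    return FileVisitResult.TERMINATE; // caller asked for the search to stop
                }
                Path name = p.getFileName(); // null for a root such as C:\
                if (name != null && name.toString().toLowerCase(Locale.ROOT).contains(wanted))
                {
                    found.add(p);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }
}
```

Possible use: call TolerantFinderSketch.find(validatedStartPath, walkDepth, target, cancelled) from a
background thread (e.g. a SwingWorker) so that a button click can still call cancelled.set(true) while the
search is running.
*/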