Sunday, April 3, 2011

Overhead of File/Directory.Exists in getter?

I have a class with several properties that refer to file/directory locations on the local disk. These values can be dynamic, and I want to ensure that any time they are accessed, I verify that the location exists first, without having to include this code in every method that uses the values.

My question is: does putting this check in the getter incur a performance penalty? It would not be called thousands of times in a loop, so that is not a consideration. I just want to make sure I am not doing something that would cause unnecessary bottlenecks.

I know that it is generally not wise to optimize too early, but I would rather have this error checking in place now than have to go back later, remove it from the getter, and add it all over the place.


Clarification:

The files/directories pointed to by the properties are going to be used by System.Diagnostics.Process. I won't be reading from or writing to these files/directories directly; I just want to make sure they exist before I spawn a child process.
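
As an illustration of the pattern being asked about, a minimal sketch might look like the following. The class name `ToolPaths`, the property `ExecutablePath`, and the choice to throw `FileNotFoundException` are assumptions made for the example, not part of the original code.

```csharp
using System.IO;

// Hypothetical class; all names here are illustrative only.
public class ToolPaths
{
    private string _executablePath;

    public ToolPaths(string executablePath)
    {
        _executablePath = executablePath;
    }

    // Every read of this property calls File.Exists, so each access
    // costs one file system query.
    public string ExecutablePath
    {
        get
        {
            if (!File.Exists(_executablePath))
            {
                throw new FileNotFoundException(
                    "Executable not found.", _executablePath);
            }
            return _executablePath;
        }
        set { _executablePath = value; }
    }
}
```

Whether a getter should behave this way at all is a design question separate from its cost.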

From stackoverflow
  • If you are reusing an object, you should consider using the FileInfo class instead of the static File class. The static methods of the File class perform a possibly unnecessary security check on each call.
    FileInfo - DirectoryInfo - File - Directory

    EDIT:

    My answer would still apply. To make sure your file exists you would do something like so in your getter:

    if (File.Exists(fName))
    {
        // do stuff
    }
    else
    {
        // file doesn't exist
    }
    

    OR

    FileInfo fi = new FileInfo(fName);
    if (fi.Exists)
    {
        // do stuff
    }
    else
    {
        // file doesn't exist
    }
    

    Correct?

    What I am saying is that if you are looping through this logic thousands of times, use a FileInfo instance rather than the static File class, because the static File.Exists method will hurt performance.

    Jason Miesionczek : Please see my clarification above.
  • If you're that worried about performance (and you're right when you say that it's not a good idea to optimize too early), there are ways to mitigate this. If you consider that the expensive operation is the File I/O and you have lots of these going on, you could always look at using something like a Dictionary in your class. Consider this (fairly contrived) sample code:

    private Dictionary<string, bool> _directories = new Dictionary<string, bool>();
    
    private void CheckDirectory(string directory, bool create)
    {
      // Only touch the file system for directories not seen before.
      if (!_directories.ContainsKey(directory))
      {
        bool exists = Directory.Exists(directory);
        if (create && !exists)
        {
          Directory.CreateDirectory(directory);
        }
        // Add the directory to the dictionary. The value depends on
        // whether the directory previously existed or the method has been told
        // to create it.
        _directories.Add(directory, create || exists);
      }
    }
    

    It's a simple matter later on to create those directories that don't exist by iterating over this dictionary.

    1. It is feasible for the path to exist at the point it is checked but be moved or deleted between that check and the operation on it.

      • You may already know this and accept the risk, but just so you are aware of it.
    2. If you are going to do it anyway, it doesn't matter whether it's in a property or not, just what granularity of checking you do (once per operation or once per group of operations).

    3. If you use the non-static FileInfo operations, be aware that this object will cache its view of the file system.

      • This could be a good thing for you, as you can control how often the cache is refreshed via the Refresh() method, or it may lead to bugs in your code.
    4. The usual "try it first before worrying about performance" recommendation applies, but you indicate you are aware of this.
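
    The caching behaviour mentioned in point 3 can be sketched as follows. The class and method names (`FileInfoCacheDemo`, `Run`) are made up for this example, and the middle value is expected to be stale only on the assumption that FileInfo keeps its state until Refresh() is called.

    ```csharp
    using System.IO;

    // Hypothetical demo class; names are illustrative only.
    public static class FileInfoCacheDemo
    {
        // Returns { before, stale, fresh }: the value of Exists before the
        // file is deleted, after deletion without Refresh(), and after Refresh().
        public static bool[] Run()
        {
            string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
            File.WriteAllText(path, "demo");

            var info = new FileInfo(path);
            bool before = info.Exists;   // first access populates the cached state

            File.Delete(path);

            bool stale = info.Exists;    // read again without refreshing
            info.Refresh();              // re-query the file system
            bool fresh = info.Exists;    // reflects the deletion

            return new[] { before, stale, fresh };
        }
    }
    ```

    If the check has to reflect the disk at the moment of access, call Refresh() first or use the static File.Exists instead.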

  • Anything that's not a simple lookup or computation should go in a method, not a property. Properties should be conceptually similar to just accessing a field - if there is any additional overhead or chance of failure (and IO - even just checking a file exists - would fail that test on both counts) then properties are not the right choice.

    Remember that properties even get called by the debugger when looking at object state.

    Your question about what the overhead actually is, and optimising early, becomes irrelevant when looked at from this perspective. Hope this helps.
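
    One way to apply this advice is sketched below: the existence checks live in an explicitly named method, and the launch code still expects failure, since the path can disappear between the check and the start. The class `ChildProcessLauncher`, its members, and the choice of `IOException` are assumptions for illustration only.

    ```csharp
    using System.Diagnostics;
    using System.IO;

    // Hypothetical class; names are illustrative only.
    public class ChildProcessLauncher
    {
        private readonly string _executablePath;
        private readonly string _workingDirectory;

        public ChildProcessLauncher(string executablePath, string workingDirectory)
        {
            _executablePath = executablePath;
            _workingDirectory = workingDirectory;
        }

        // A method rather than a property: the name signals that this
        // does I/O and can report failure.
        public bool TryValidatePaths(out string error)
        {
            if (!File.Exists(_executablePath))
            {
                error = "Missing executable: " + _executablePath;
                return false;
            }
            if (!Directory.Exists(_workingDirectory))
            {
                error = "Missing working directory: " + _workingDirectory;
                return false;
            }
            error = null;
            return true;
        }

        public Process Start()
        {
            string error;
            if (!TryValidatePaths(out error))
            {
                throw new IOException(error);
            }

            var startInfo = new ProcessStartInfo(_executablePath)
            {
                WorkingDirectory = _workingDirectory,
                UseShellExecute = false
            };

            // The file may still vanish between the check above and this
            // call, so callers should be prepared for Process.Start to throw.
            return Process.Start(startInfo);
        }
    }
    ```

    The validation runs once per launch rather than on every property read, which also keeps it out of the debugger's property evaluation.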
