
LF experienced people to review my archiving system 4 school

Posted: 2016/03/24 13:03:33
by genezing

I realise this is a bit of an off-topic thread, so I hope this is the right place to ask my questions.

I am a 2nd-year student at a university in Ghent (Belgium), and for our project we need to set up an archiving system for student work: Java projects, .NET projects, tests, exams and so on.
The purpose is that our school can put all these files on a server, store them for a long time and access them quickly. Students can access this platform to view their work and upload files; teachers can access all
the files and execute Java and .NET projects via this platform.
You access the platform with school credentials.

So I am going to work with CentOS servers plus a Windows server for the .NET applications (I've seen you can use Mono, but I don't think it's stable enough). For version control and updates I would use scripts run from crontab,
and to automate everything on the CentOS servers I would also use scripts together with Puppet. All files are sent between the servers over SSH.
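As a sketch of the crontab side (every hostname, path and schedule below is invented, not part of the plan itself), entries in a cron.d file on a CentOS server might look like:

```shell
# Hypothetical /etc/cron.d/archive entries (paths and times are examples).
# m   h  dom mon dow  user  command
  0   2  *   *   *    root  /usr/local/bin/pull-uploads.sh   # nightly file sync
  30  3  *   *   0    root  /usr/sbin/puppet agent --test    # weekly config run
```

Note that files in /etc/cron.d take a user field, unlike a personal crontab.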

We need to make a physical system, no cloud.

This is my plan:

From a host PC you log in; a SQL server is connected and checks your CRUD rights. Once logged in you see your own work, or, as a teacher, all the static files (ordered, of course) with the option to execute projects.

To store files there are two ways: either upload them to the central file server via SFTP, or push to a Git server (which clones the project to the file server). The file server sends the static files to the LAMP server (the LAMP server is the same machine that serves the login page). Projects (.NET, Java) are stored on the file server and stay there. If a teacher wants to execute a .NET application, the LAMP server sends a trigger to the build server; the build server builds the application from the file server onto the .NET or Java server and shows the result.
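The "Git server clones the project to the file server" step could be driven from a post-receive hook. Here is a minimal local sketch of the mirroring itself (every path below is invented; on the real system the archive directory would sit on the file server and be reached over SSH):

```shell
#!/bin/sh
set -e
# Local sketch: mirror a pushed repository into an archive area, as a
# post-receive hook on the Git server might do. All paths are invented.
WORK=/tmp/demo-git
REPO="$WORK/project.git"        # stands in for the Git server's bare repo
ARCHIVE="$WORK/archive"         # stands in for the file server's storage
rm -rf "$WORK"
mkdir -p "$WORK"

# Create the bare repo and push one commit to it, as a student would.
git init -q --bare "$REPO"
git clone -q "$REPO" "$WORK/checkout" 2>/dev/null
( cd "$WORK/checkout" &&
  git -c user.email=s@example.com -c user.name=student \
      commit -q --allow-empty -m "initial hand-in" &&
  git push -q origin HEAD )

# The archiving step itself: a full mirror clone, so the file server
# always holds a complete, stand-alone snapshot of the history.
git clone -q --mirror "$REPO" "$ARCHIVE/project.git"
git -C "$ARCHIVE/project.git" log --oneline
```

On the real system the final clone would target the file server, and subsequent pushes would run `git remote update` on the mirror instead of re-cloning.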


So now: is this a sound design for the problem? I need advice on this. Thank you for reading!

Re: LF experienced people to review my archiving system 4 sc

Posted: 2016/03/24 14:08:22
by MartinR
Before designing your archiving system I think you need to clarify a few points:
  • What do you consider a long time?
  • How much data are you talking about?
  • Who has ownership of the data, particularly for students and teachers who have left?
  • How quickly is quickly?
  • How is the media maintained?
If by long you mean the duration of your course, say 2 years, you will get a very different system from an archive designed to be useful in 10, 20 or 50 years. Likewise if you are storing a few GB, you get a different result from considering terabyte or even petabyte storage. If the data comes under the purview of data-protection legislation or company confidentiality/IP, then you need to show some control over access in the future. Quickly can mean a few hours (as against the overnight courier), or near-line (robotic tape storage), or straight off disk. For small datasets on disk conventional backup may suffice, but for near-line and off-line storage media maintenance becomes an issue.

I would seriously recommend splitting the project into a number of quite distinct layers:
  • An archive system to move data into and out of long-term storage
  • Possibly a GUI to send commands to the above
  • A cross-platform migration strategy: Windows <-> Linux and iOS <-> Linux come to mind. Maybe put the GUI in here?
  • Build systems - these are usually OS/version dependent.
If you design a complex system which is dependent upon multiple OSs and packages you will be forever having to redesign and reintegrate. Define your layers clearly and you only have to work on one module at a time.

Be especially wary of binary files, including word-processor documents, spreadsheets and databases. Are you certain that you will be able to read them in 5 years' time? Critical text needs to be stored as ASCII (OK, possibly ISO Latin-1); XML and its friends make that easier today. Include a simple manifest.
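A manifest can be as simple as a plain-text list of file names with checksums. Everything in this sketch (the directory name, the single sample file) is made up for illustration:

```shell
#!/bin/sh
set -e
# Sketch: build a plain-ASCII manifest for one archived project.
# The directory and file below are invented for the demonstration.
ARCHIVE=/tmp/demo-archive-item
rm -rf "$ARCHIVE"
mkdir -p "$ARCHIVE"
printf 'exam answers\n' > "$ARCHIVE/answers.txt"

cd "$ARCHIVE"
# One line per file: SHA-256 checksum, then path. Being plain text, the
# manifest itself stays readable no matter what tools exist in 10 years.
find . -type f ! -name MANIFEST -exec sha256sum {} + > MANIFEST
cat MANIFEST

# Years later, detect bit rot or tampering:
sha256sum -c MANIFEST
```

The same manifest doubles as an integrity check for the long-term storage question above.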

A few more points:
  • If you trust the PC login, can't you use its credentials to give a single sign-on? If you don't trust the PC you've got bigger problems.
  • Keep execution and archiving separate.
  • Why give two upload options? Select one and enforce it. FTP has a longer and more stable history than Git, but Git is not going away any time soon. FTP is trivial to drive, Git less so.
  • Define clearly what a project is. Consider: should an archive be a totally stand-alone snapshot or an incremental change? The latter saves space, assuming all previous versions are available. Now consider this from a 5- or 10-year perspective.
  • There is nothing wrong with an upper layer triggering requests down to lower layers. Establishing a good, modular stack will facilitate this.
I'm sorry if this all seems a bit critical. Speaking from experience, trying to retrieve decades-old data can be a nightmare; forethought and planning for changing standards can ease the way. Probably the best advice is KISS (Keep It Simple) :)

Re: LF experienced people to review my archiving system 4 sc

Posted: 2016/03/24 21:08:22
by genezing
Your response is not critical at all; I need this kind of reply. Thank you for replying! :)

I need to keep it simple, that's true, but it needs to be a strong, solid network/platform.

I forgot to mention some points:

All the files need to be stored for roughly 10 years.
Data will be around 500 GB per year.
Quickly means very quick: almost instant access for static files; for applications it will depend on the application itself.
It should be a self-maintaining system.

Re: LF experienced people to review my archiving system 4 sc

Posted: 2016/03/25 08:15:20
by giulix63
What you are talking about here is basically a CMS (Drupal 7 comes to mind, since it's in EPEL) with the ability to execute web apps. I am not sure Drupal can do that out of the box, but you could implement your apps as Drupal modules. There's also much talk these days about M$ getting into bed with RH. So you could jump on the wagon and run Java alongside .NET apps on OpenShift Origin, which is (I believe) the local, non-cloud version, along with your CMS.

P.S. This was compiled with the help of a colleague sitting next to me who is a JEE Architect with a sweet tooth for OSS.

Re: LF experienced people to review my archiving system 4 sc

Posted: 2016/03/29 10:52:48
by genezing
I looked into OpenShift. Am I right in saying that the full version is not free, and that running .NET apps is really hard and not stable yet?


Re: LF experienced people to review my archiving system 4 sc

Posted: 2016/03/29 12:23:28
by giulix63
I have no direct experience of running OpenShift but, as usual with RH products, I have been told that the community edition is free (no support from RH, of course). As for .NET stability, I really cannot say... As with all new technologies, I suppose some quirks are to be expected, compensated for by a long-term competitive edge.

Re: LF experienced people to review my archiving system 4 sc

Posted: 2016/03/30 13:45:17
by genezing
Thank you for your reply, I will look deeper into it.

But I'm stuck on a problem. I've managed to create a front-end server that lets you upload files (they are stored in /uploads). The thing is, I have no clue how to transfer these files to the file server and then delete them on the front-end server. I can do this via scripts and call them with PHP, but I don't think that's the way to go...

Or should I let the file server do this job and have it check for new files to sync? It would keep polling the front-end server for new files and delete them from the front-end server afterwards.