The Problem with Research Software Engineering

Leios Labs

Published 14 Jun 2024
A discussion about how to make research software engineering a bit better!
Github sponsors (Patreon for code): github.com/sponsors/leios
Twitch: / leioslabs
Discord: / discord
Github: github.com/leios
Science & Technology

COMMENTS • 139

@LeiosLabs 3 years ago ⁺³⁷
A bit of a different video today about something that's been on my mind. I know it's a bit of a rant and more or less a clip from my livestream, but I thought some people might benefit from it! Let me know if you like this type of content as well. If so, I am happy to do more "lecture-style" videos on various topics.
@LeiosLabs 3 years ago
What do you mean, exactly? Like a better code review process to go along with the peer review?
@PerpetualHope 3 years ago
Have you tried putting this in the science twitter sphere? Could generate much-needed discussion there among researchers
@Crushnaut 3 years ago
TIL some people's rants have powerpoints
@gmt-yt 3 years ago ⁺¹
Hell, yes! Also media. The problem is particularly noticeable in libraries where one project's "issues" tend to infect other, "innocent" projects, who thought they would just be consuming an "API" and instead ingested a can of worms. Sci/media projects tend to be rife with platform and abandon-ware dependencies, idiosyncratic build frameworks, undocumented behavior, hard-coded constants, nonstandard object persistence, premature optimization, low-level programming abuse, non-adherence to coding standards, and any other offense to maintainability and code-safety imaginable. It's not your imagination.
@foobars3816 3 years ago
Often the code I see coming from people with a scientific background also tends to use a lot of single-letter variable names instead of descriptive ones that could aid in understanding the code. Has this been your experience? I was surprised it wasn't mentioned.
@bartzijlstra3193 3 years ago ⁺⁴⁵
Having recently started as a "real" software engineer after finishing my PhD, I recognize many of these problems. We did do version control and unit-testing for our research software, but I often passed up on good software documentation in favor of writing the actual research articles. I've also had many requests from colleagues to share my code for making high-quality graphs. Most of the time I had to reply with: "You can have my code, but it won't work directly on any other data than mine. Please take my code as-is, and use it as an example to try writing something of your own." I know I could have made my graphing tools much more modular and general, but at the end of the day I needed to have my thesis finished.
@LeiosLabs 3 years ago ⁺⁸
Honestly, using version control and doing proper testing is still pretty good! In my opinion, software is only really as useful as its documentation, but if the code was meant as a script for a single publication and not meant for reuse, I think it's acceptable to have less documentation... As long as people can still understand the code enough to replicate the results!
@rentristandelacruz 3 years ago ⁺⁴¹
I worked as a research assistant in a chemistry laboratory that primarily deals with simulation. The lab head is still using a FORTRAN code for nucleation simulation. I believe that code is at least 20 years old. When I tried to read the code, it had variables like 'xxx' and 'yyy'.
@apurbabiswas7218 3 years ago ⁺⁹
I'm worried about working with an old code base myself. That seems dreadful.
@LeiosLabs 3 years ago ⁺⁹
Haha. I've definitely had a similar experience! I actually like Fortran, but bad code is bad code.
@altaroffire56 3 years ago ⁺¹²
In my experience, Fortran is very convenient for that kind of work. (Fast, low-level, clean syntax, built-in support for matrices and complex numbers... it feels like the right tool for the job. With C, on the other hand, it often feels like working against the language to get it to do what I want.)
A determined scientist can write unreadable code in any language, though.
@LeiosLabs 3 years ago ⁺¹¹
@altaroffire56 A lot of people say Fortran is bad, but I 100% agree. Usually Fortran is seen as a bad programming language only because the people programming in it use bad programming practices.
@coryrobertson6367 3 years ago ⁺⁵
@LeiosLabs Let's not forget that what are now bad programming practices may have been the best available at the time. Short identifiers due to length restrictions, terse programs due to small screen sizes, few inline comments due to file limitations, etc. It is the improvements in technology that have allowed us to write more human-readable code.
@AngryArmadillo 3 years ago ⁺⁵⁵
As someone who has worked in both pure software development and pure CS research positions, I completely agree. Especially when it comes to documentation and peer review of code, I’m shocked by the lack of standardization. Asking a researcher for access to their code is a true roll of the dice.
@LeiosLabs 3 years ago ⁺¹⁰
This has been my experience as well, but I definitely started on the academic side. The moment you leave the academic bubble, you start to realize how poor the software standards actually are in academia.
@zebulon220 3 years ago ⁺²⁸
Congrats on your PhD, I completely agree with everything you say in this video.
@LeiosLabs 3 years ago ⁺³
Thanks! I'm hoping to start a discussion and maybe get researchers to think a bit more about their code.
@Axman6 3 years ago ⁺¹⁵
My experience with researchers writing code was that the piece of software they needed most was git. So much version-control-via-making-copies-and-emailing-it-to-yourself.
@LeiosLabs 3 years ago ⁺⁷
Who needs version control when you have dropbox?
@Axman6 3 years ago ⁺⁶
LeiosOS stop it, please 😭
@Pa_Nic 3 years ago ⁺¹²
Competitive programmers may be able to help. You can get relatively clean and simple code from very complex new algorithms if you ask competitive programmers. We are trained to code common algorithms really quickly and occasionally search for better (faster, more memory efficient, working online, etc.) algorithms to implement so we can use them as "secret weapons" during contests.
As an example: given a tree graph of N nodes, it is widely known that you can find its centroid decomposition in O(N log N) time. However, a quick Google search will lead you to a paper demonstrating O(N) centroid decomposition which has no code. To verify, we usually just read the paper, code the algorithm ourselves, and stress test it against the verified slower algorithm with thousands of randomly generated cases.
Might it be possible for researchers to get competitive programmers to verify their work?
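The stress-testing workflow described in this comment can be sketched in Python. This is only an illustration under stated assumptions: instead of a full centroid decomposition it checks the simpler task of finding the centroid(s) of a tree, and every function name here (random_tree, centroids_slow, centroids_fast, stress_test) is hypothetical rather than taken from any paper or library.

```python
import random


def random_tree(n, rng):
    """Edges of a random tree on nodes 0..n-1: each node attaches to an earlier one."""
    return [(rng.randrange(i), i) for i in range(1, n)]


def adjacency(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def centroids_slow(n, edges):
    """Trusted reference, O(N^2): delete each node and measure the pieces left over."""
    adj = adjacency(n, edges)
    result = []
    for removed in range(n):
        seen = {removed}
        largest = 0
        for start in adj[removed]:
            # Flood-fill the component that contains this neighbour.
            stack = [start]
            seen.add(start)
            size = 0
            while stack:
                node = stack.pop()
                size += 1
                for nxt in adj[node]:
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
            largest = max(largest, size)
        if largest <= n // 2:
            result.append(removed)
    return result


def centroids_fast(n, edges):
    """Candidate under test, O(N): one traversal plus subtree sizes."""
    adj = adjacency(n, edges)
    parent = [-1] * n
    visited = [False] * n
    visited[0] = True
    order = []
    stack = [0]
    while stack:
        node = stack.pop()
        order.append(node)
        for nxt in adj[node]:
            if not visited[nxt]:
                visited[nxt] = True
                parent[nxt] = node
                stack.append(nxt)
    size = [1] * n
    largest_child = [0] * n
    # Reversed visiting order processes every child before its parent.
    for node in reversed(order):
        p = parent[node]
        if p != -1:
            size[p] += size[node]
            largest_child[p] = max(largest_child[p], size[node])
    return [v for v in range(n)
            if max(largest_child[v], n - size[v]) <= n // 2]


def stress_test(trials=500, max_n=30, seed=0):
    """Compare both implementations on many small random trees."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, max_n)
        edges = random_tree(n, rng)
        expected = centroids_slow(n, edges)
        actual = centroids_fast(n, edges)
        assert actual == expected, (n, edges, expected, actual)
    return trials
```

Small inputs are deliberate: when the two versions disagree, the failing tree printed by the assertion is short enough to debug by hand.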
@LeiosLabs 3 years ago ⁺⁶
I don't think that would be a bad idea. I think the best bet is to teach researchers to think more like competitive programmers in this case.
@fa-pm5dr 3 years ago ⁺³
@LeiosLabs Competitive programming has one of the steepest learning curves I have seen.
@milobem4458 3 years ago
How do you get them to work together? Academia has a very traditional structure. Software engineers are sometimes hired as "lab assistants", which means everyone ignores them until the last minute, when some bug shows up in a big mess of unreadable code. If they get accepted into an academic position, such as PhD student or postdoc, they are pressured to publish their own work ASAP or get lost.
@apurbabiswas7218 3 years ago ⁺²⁵
This was very helpful. I'm going to look more into JOSS.
As a physicist interested in scientific computing, I find unit testing almost a foreign concept, and I feel fairly inadequate compared to my computer science peers.
I've had enough exposure to the importance of version control prompting me to learn git myself. For anyone else in a similar position, look at the MIT Missing Semester Jan2020 IAP for similar computer sciencey-"filler" education.
More videos about CliMA would be cool : )
@LeiosLabs 3 years ago ⁺⁵
I was in your exact position at the start of my PhD. I knew about unit testing, but never "needed" it for my code and used version control, but couldn't really get my peers to use it, so I was stuck. It was an uphill battle for me, but learning proper programming practices helped out my research tremendously!
@apurbabiswas7218 3 years ago ⁺²
@LeiosLabs Thanks for the tip. I feel fairly lucky in this regard, as there's so much to learn from online that hopefully I'll have it easier. Content like yours helps, so thank you once again.
@brandonnelson8781 3 years ago ⁺³
Thank you for posting this. Going through my PhD now, I experience many of these pains that you've clearly outlined here. If we could continue to grow this discussion and build a scientific community more embracing of software engineering practices, starting with git and code re-usability, the long term gains would certainly outpace the short term learning pains.
@AaronPM55 3 years ago ⁺²
I work in the DSP field and we work closely with people in academia. I 100% agree with what you say. So much time could have been saved if the code handed to us was written better or even followed the paper.
I think a big thing is that some older people in academia have the attitude of "if you used simulations, you didn't solve the problem." I personally think it's weird to see people not use software as a tool for verification on both generated and real data.
@lw4423 3 years ago ⁺⁴
I wish someone would make a tutorial where the student can follow along and learn to make a Julia package that does something trivial. The point would be learning to make a package and put it on GitHub, plus all the documentation and tests and making branches and all that.
@tallon3925 3 years ago ⁺¹
Found you through OIST's YouTube channel, love your videos! Thanks for sharing your passions.
@LeiosLabs 3 years ago
Oh, cool! Happy to see you are checking out OIST's content! They've really been doing their best to put out cool, compelling content recently!
@youtubereview8176 1 year ago
Thank you so much for posting this video. What I've heard a lot about algorithms is that when a paper claims great performance, it's very likely that the implementation will be very costly and won't perform better than the current solution. OFC, there are also some breakthroughs.
@alijassim7015 3 years ago ⁺³
This video is so spot on! Publishers need to see this.
@LeiosLabs 3 years ago
I was hoping to start a discussion.
@IamLupo 3 years ago ⁺¹
It's not a rant, my friend. You made a valid point! Keep doing what you're doing, buddy!
@ProjectPhysX 3 years ago ⁺³
Congrats on your PhD!
Thank you for your perspective on research software engineering. I have never seen course offers at my university for scientists on how to write good software and in the end it comes down to teaching yourself.
I work in the same field (PhD candidate in computational fluid dynamics with LBM / physics) and I've seen lots of bad code as well, due to the points you've discussed. But that's not always the case.
The incentive to write clean code is there at the latest once you work on software as a team. We refactor on a regular basis and make sure every line is properly documented.
Because of the teamwork, version control becomes a necessity as well.
Testing code is actually most of the work. If code is not testable and the results are not reproducible, it is trash code, no matter in which field.
The main incentive for our software project actually was hardware (GPU) efficiency and performance. No other software on the market is capable of comparable performance, so we had to write our own.
Regarding job chances, research software engineering is not a dead end at all. If you really master scientific programming, you don't have to apply for a job because companies will apply for your time.
@LeiosLabs 3 years ago
I think we have almost the same perspective here. I'm happy to see more people writing the best code they can given the circumstances!
As a note: I've been considering doing a video on heterogeneous computation (CPU / GPU) in Julia for HPC relatively soon.
@milobem4458 3 years ago ⁺¹
I don't know about your team, but many scientists seem to think "testing" means running the whole 40-hour experiment and comparing the results with the last run. When software engineers talk about testing, they mean small and readable unit tests which take seconds to run and verify automatically. But again, we can't blame self-taught programmers for not knowing all the best practices.
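To make the contrast concrete, here is a minimal sketch of what such small unit tests could look like for a numerical routine. The trapezoid helper and the test names are hypothetical and chosen only for illustration; each test checks one property against a known analytic answer and runs in well under a second.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule on n equal subintervals (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def test_exact_for_linear_functions():
    # The trapezoidal rule integrates straight lines exactly:
    # the integral of 2x + 1 over [0, 3] is 12.
    assert math.isclose(trapezoid(lambda x: 2 * x + 1, 0.0, 3.0, 4), 12.0)


def test_converges_on_known_integral():
    # The integral of sin over [0, pi] is exactly 2.
    result = trapezoid(math.sin, 0.0, math.pi, 1000)
    assert math.isclose(result, 2.0, abs_tol=1e-5)


def test_rejects_empty_grid():
    # Invalid input should fail loudly instead of returning nonsense.
    try:
        trapezoid(math.sin, 0.0, 1.0, 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for n = 0")
```

Functions named test_* in this style are collected automatically by common test runners such as pytest, so they can run on every commit rather than once per experiment.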
@rifatahamed7052 3 years ago ⁺¹
This is an essential topic for research. More incentive should be given for research software development. Much high-quality research depends on how well a simulation or model has been formulated and executed. Better programming practices in developing research work will lead to better research.
@felixrichter1100 3 years ago ⁺²
As a master's student working in a research group, I could not agree more with all the things you just said.
@LeiosLabs 3 years ago
Happy to hear I'm not alone!
@gavinpeng1976 3 years ago ⁺⁴
Right on point. I am currently trying to refactor an old academic codebase consisting of Matlab, Python, Java, and C++ that are glued together using Matlab, and it is just a nightmare. And yes, Matlab is evil - you often see thousands of lines of code without encapsulation and a huge namespace. I genuinely think that many more people would have used the code if it had been written in a more professional fashion.
@LeiosLabs 3 years ago ⁺¹
That sounds awful! I have had similar experiences, but nowhere near *that* bad. On the other hand, at least they are giving you time to refactor!
I really feel we need to be honest about the fact that software is how people conduct research. Poor documentation / software practices are precisely the same as keeping a bad lab notebook / sloppy methodologies.
@HatersGonnaHate4 1 year ago
This video has really made clear some issues I've noticed at my current (research focussed) job and it's very satisfying to hear it stated succinctly.
@gz6616 3 years ago ⁺⁴
I recently came across JOSS and made a submission to it. In so doing I found that there are lots of things I didn't know, including writing tests, writing documentation with Sphinx, and properly packaging the code. At least I knew a bit of git, which I learned during my spare time, when making side projects totally unrelated to the research projects. These are things we have to learn by ourselves; the institution does not provide such training, and many of my colleagues don't care about these things at all.
And I also dislike Matlab; arrays don't even start from 0.
@LeiosLabs 3 years ago ⁺¹
This is my experience as well. I like JOSS because it brings up a lot of topics academics typically ignore.
@shoam2103 3 years ago
Arrays not starting from 0 isn't a problem. R, Julia, Lua, etc. all start with 1. In functional / array programming languages it doesn't matter. You'd basically be writing the same code in Haskell (a 0-indexed functional language), for example. Function composition instead of loops, always!
@SoopaPop 3 years ago ⁺⁵
Thank you for this video. I'm a 3rd year doctoral student in Applied Math, specifically in the scientific computing subdisciplines you mention. I'm currently finalizing a moderate-size (about 4000 lines of C) codebase to be open sourced along with a paper submission. There's a serious crunch-time feeling, which is causing various holes in documentation as well as crappy, inefficient fixes. You're definitely right: writing well-documented code feels impossible when one is also supposed to be pushing out theoretical breakthroughs of some flavor.
On the other hand, it is also very hard to write code that works without a strong grounding in the theory of a subject.
@LeiosLabs 3 years ago
Right, exactly! You need domain knowledge and software knowledge. It's not an easy job at all! Good luck with the paper!
@loutsauv5079 3 years ago ⁺³
Thank you! That's exactly why I did not go into research :(. I was shocked that in operational research, code was not standardized, shared, or reviewed! People are publishing results (sometimes modified or cherry-picked) that are impossible (or long and hard) to verify.
Please, we need more people to pay attention to this problem, which in my opinion slows down research and hinders its credibility!
It's also true in economics, most infamously in the case of a flawed, badly reviewed research paper built on a badly written Excel file, which drove much of modern public-debt policy on the basis of false assertions :( !
@LeiosLabs 3 years ago
I definitely feel your pain! It's sometimes incredible how poor research code can be!
@KevinHorecka 3 years ago ⁺¹
This video speaks to me so much. I was a software engineer/systems engineer before going back to grad school, and I was the only computational-focused person in my lab for Neuroscience. There were other folks who knew how to program (and some who couldn't do more than a stats script), but writing "good" code (as loaded as that is) was just not a priority because no one else was ever going to see it (because there was no avenue to share and no one wants to replicate results anyway).
Lo and behold, my code ends up being pretty useful for some other work (related to TBI), and it is fortunately very documented so I was able to share it. It's far from perfect, and finding the balance of where to stop on it because it was good enough was a huge challenge. I would've loved to submit it to a journal and get it more polished, but there was no value in that (at least relative to the other priorities I had to graduate).
I wish I knew how to help push the culture forward in this space. I left academia after graduating, so I'm afraid I'm not being very helpful. I've started publishing again recently around my volunteer work, so maybe that's my avenue to help.
@LeiosLabs 3 years ago ⁺¹
I'm glad you are still thinking about helping out! I think creating well-documented code is already a good step forward. If people can use your code more easily, then they will start to see the value in good programming practices.
@KevinHorecka 3 years ago ⁺¹
@LeiosLabs Thanks! Keep up the great work! Your videos are always a joy to watch.
@yas-xk5wf 3 years ago ⁺²
Excellent video. I empathise with the points you've made. They're relevant not only in academia but also in commercial settings, where there is pressure to release and not enough resources are committed to building robust systems. Or, conversely, systems are engineered well but the solutions aren't scientifically rigorous.
@LeiosLabs 3 years ago ⁺²
I don't often see production code, but don't doubt there are similar issues there! In general, we need to do better to write high-quality software whenever possible!
@monikaparmar2061 3 years ago ⁺²
Great video. Thanks!
@LeiosLabs 3 years ago ⁺¹
Glad you liked it!
@SapereAude1490 1 year ago
Watching this video while working on a Matlab AppDesigner web app for a paper. PhD in chemistry. Everything you say is true.
I've been watching Python tutorials lately. I hope to escape this landscape and become a proper developer, because I know full well this is not the proper way to do programming.
@TheMazyProduction 3 years ago ⁺³
CONGRATS 🎉🎈🍾 on the PhD, well deserved.
@LeiosLabs 3 years ago
Thanks a bunch!
@MarcelRobitaille 3 years ago
I totally agree! So many more things need to be under version control.
@BradenEliason 3 years ago ⁺⁴
I think there's a lot of researchers that see Matlab as a necessary evil. There's just such a developed ecosystem of tools and labs are reluctant to migrate.
I would like to see some sort of bounty system for migrating Matlab code to Julia. In ultrasound physics there's packages like Field II and FOCUS for Matlab which I would like to see migrated and I'd be happy to chip in to some fund to make that happen.
@LeiosLabs 3 years ago ⁺¹
Yeah, Matlab for experimental work is one thing. If there are no other packages that allow users to connect with their experiments, then it's the only option.
I would love to see Julia take over the role of Matlab, though. Have you gone on the Julia Slack or Discourse to see if there is development in the areas you need for your research?
@BradenEliason 3 years ago ⁺²
@LeiosLabs I've looked, but nothing has anywhere close to feature parity with the Matlab packages I referenced. To dethrone Matlab you need to get a critical mass of labs to switch over, and that's not going to happen quickly if necessary packages are missing. Unfortunately, the same labs that are reluctant to switch are the ones that are best suited to migrate the libraries to Julia. It's a bit of a chicken-and-egg problem. What do you think of a bounty system to reward open source research programmers?
@LeiosLabs 3 years ago ⁺¹
@BradenEliason I wish the bounty system would work! If there was funding for that, I would love to take a crack at it.
@porschepanamera92 3 years ago ⁺³
And you're probably only talking about software engineering or at least from that perspective.
However, as a PhD student in the field of structural engineering, I wasn't specifically trained to write code, but some/many problems can't be solved without a bit of coding. Now, imagine that horrible google-stackoverflow-slapped-together-frankenstein code. But most of the time, if it works it works and I'm more than happy I made something that actually does what it should. And indeed, usually, once the publication is done, the project is set aside, as well as the code. Fully agree though!
@LeiosLabs 3 years ago ⁺¹
I tried to speak both from the academic perspective (where there is no incentive to write clean code) and the software engineering perspective (where there is no incentive for software engineers to stay in academia). I also left out some content that was particularly ranty about academia. I want to start a conversation, not an argument.
@porschepanamera92 3 years ago
@LeiosLabs I hear you and I wanted to acknowledge this from my own experience in academia. I'm curious how it will evolve over time, as programming becomes increasingly important (I think).
@soheilsolhjoo 3 years ago ⁺¹
Hello James, congrats on your PhD, and thanks for bringing up this topic. As a computational scientist, I needed to write a lot of different codes from my bachelor's project till my postdoc, yet only very recently I learned about git. Such a shame!
Regarding publishing codes and making them open access, I totally agree: at the very least, the benefit will be required documentation alongside the code itself.
However, I'm not really sure whether reviewing the code is a good idea. Imagine that for a particular work you need to write in different languages, e.g. bash scripts for Linux, Matlab, and Fortran. That doesn't happen all the time, but it's possible, and expecting the reviewers to be familiar with all of these languages doesn't seem that realistic to me. Moreover, if a code is supposed to model a phenomenon, shouldn't its results be the main concern in the review process?
Perhaps it would help if the publishing houses made it mandatory to publish the accompanying code (plus proper documentation) for each paper, even without a proper review, and let the readers/users decide for themselves.
@LeiosLabs 3 years ago ⁺¹
This is a great perspective and good discussion!
I still say that we need some sort of review. If a reviewer doesn't know Fortran (or the language the code is in) and is reviewing for a journal, that's totally fine, but *someone* should know Fortran in the review process. A good solution would be to link most paper reviews to JOSS, so people can review the software independent of the scientific review.
Part of the current problem is due to the fact that people don't know the languages / methods used in the field, but still review for that field. I would argue that if they can't read the code, they shouldn't review the paper. Obviously, this is unrealistic in practice, so for a good first step, we should at least publish code with the paper (like you said).
@MarcelloZucchi91 3 years ago
The issue is not about the correctness of the code, rather the reproducibility of the results which is (was maybe) a pillar of the scientific process. You can always describe your algorithms with pseudocode, but that's not feasible for large projects. Moreover, people in academia usually want to keep their software for themselves, so as to retain an advantage above potential competitors. That's why services like CodeOcean are gaining popularity. They effectively provide an interface "shield" between your code and the outside world, allowing people to experiment with it as a black box (very important for scripting and interpreted languages).
@milos_radovanovic 3 years ago ⁺²
Have you had any GUI related problems with academically developed, commercial research software?
I find that research software GUIs tend to follow similar anti-design-patterns to the ones encountered in other topic-specific software fields like music composing, as showcased by Tantacrul on his channel in his "Music Software & Interface Design" series.
I watch his "diatribe" videos because I find it healing after long hours working with research software GUI.
@LeiosLabs 3 years ago ⁺¹
I don't typically come across research software with GUIs. Most of the stuff I do is CLI-only.
The one time I did use a GUI, it was a bit of a mess.
@thej680 1 year ago ⁺¹
Hello! I found your video through hearing of "research software engineers," and I am very curious. I understand your present concerns with your current career. I am looking for advice.
I received a bachelor's degree with math and cs. I am personally very interested in getting into the HPC background and would like some recommendations because I am not sure where to even begin. I am considering going back to school too, but I am also not sure about funding and such. What would you recommend to do if you were me?
Also, I would like to try out my own pet project to get some introduction to the subject. I've heard of OpenMPI and some other things for parallel computing. If you could suggest a beginner level project, what would you recommend?
@LeiosLabs 1 year ago ⁺¹
So there are a bunch of different "ways to start" with RSE. The best way is probably to just e-mail people, state your background, and ask if they need some programming help. You will learn the tools along the way.
As for a pet project, it's kinda hard to say. There's a difference between parallelism and *distributed* parallelism that you need for HPC. Going from parallel to distributed is not an easy step, but can only really be done if you have a cluster available to mess around with. You can still learn the tools necessary, though (MPI, mainly, but also CUDA for GPUs).
I might recommend looking into the Julia ecosystem at this time as I know people are looking for help with their distributed setup and having your name associated with those tools will probably help out if people are looking for Julia positions. That said, most of the RSE code is either in C(++) or Fortran, so knowing those languages might be a bit more useful.
@thej680 1 year ago
@LeiosLabs Thank you for the prompt response! I can definitely have a look into those languages and technologies you mentioned.
When you say e-mailing people, do you mean university professors or people via LinkedIn? I imagine professors. One issue I came across was one professor couldn't take my assistance unless I was a student at their university. Can some professors be flexible regarding that?
@LeiosLabs 1 year ago ⁺¹
@thej680 Not necessarily university professors. People who are writing papers or software you are interested in. Most of the code for RSE is open source and you can probably start collaborating on GitHub pretty quickly.
@abhinavadarshsood5759 3 years ago ⁺²
As a high school graduate, is it worthwhile economically to get into research in fields related to Machine Learning, Computer Science, and Software Engineering if very interested in the fields?
@LeiosLabs 3 years ago ⁺²
Yeah, definitely! If you want to go down the pure research route, there is funding and interesting research opportunities. If you want to do these fields in industry, there is also plenty of funding available.
My point here is just that software engineering is not well integrated into research in all fields. It's getting better and it's a good idea to ride the wave now while people are starting to see the need of better integration.
@danielchin1259 3 years ago
Thank you for your proposal.
@NicosLeben 3 years ago ⁺²
Good video. I think we all have the right to access the source code, and more generally all the work, that was funded by taxes. It's annoying to pay for papers when I have already paid for them.
@LeiosLabs 3 years ago ⁺¹
That is a great point from the public perspective. People want open research, so we should truly open the research!
@user-pq9qz5zl7t 3 years ago ⁺²
Dr Schloss, what software did you use to change the colour of your shirt?
@LeiosLabs 3 years ago ⁺¹
Ah, blender.
Also, didn't you have a different channel before?
@user-pq9qz5zl7t 3 years ago
@LeiosLabs Sorry, I deleted the videos you liked last time.
@chetanvardhan2906 3 years ago ⁺¹
May I ask about your experience at OIST? I was thinking about pursuing a graduate degree there... Would you recommend it?
@LeiosLabs 3 years ago
OIST was great for me, but works best for students that are self-motivated.
@abrahamx910 3 years ago ⁺¹
Agreed, congrats on your PhD.
@LeiosLabs 3 years ago
Thanks!
@FaffyWaffles 2 months ago
PREACH
@saurabh7532 3 years ago ⁺³
This is something that troubles me and made me feel weird about pursuing research long term (pursuing a PhD). I love it, but I feel it requires a lot of reform.
@LeiosLabs 3 years ago ⁺¹
The only way that reform will happen is if we target the areas in need of change and change them ourselves.
@atharvas4399 2 years ago ⁺¹
Peer-reviewing research code is next to impossible because you need a constant supply of academics with interdisciplinary knowledge: for example, someone who has a very good understanding of computational quantum mechanics for a specific experiment AND of simulation methods in Python, including familiarity with a particular tech stack.
@EvilCherry3 3 years ago
True
@kapoioBCS 3 years ago ⁺²
As a theoretical physics researcher, it is hard to imagine software engineers complaining about funding; literally all the funding in my university goes either to software engineers or to bio/med researchers :p
@LeiosLabs 3 years ago ⁺¹
That's interesting to hear / thanks for the input! I am sure a lot of funding goes into software engineering, but is it for research software or for general-purpose software?
@protocol6 3 years ago ⁺⁴
This is only tangentially related but, as someone who learned to program long before learning any higher mathematics or physics, it has always irked me that variables and constants in mathematical formalism are even worse than Hungarian notation in programming. Too many papers fail to define all their domain-specific symbols and, if you read papers from 100 years ago, you have to chase down obsolete definitions. That's before you even get to all the situations that mix multiple definitions of the same symbol (like elementary charge and Euler's number) in the same equation. It's just begging for people to make mistakes. Mathematics could stand to import some best practices from software development.
@LeiosLabs 3 years ago ⁺¹
Yeah! I completely agree with this! In some codes, theta is an angle. In others, it's temperature. It's hard to keep everything straight and is another complaint I have about a lot of research scripts! I am alright with it iff there is a comment somewhere in the code to some text that has similar notation or if there is a table of symbols somewhere available, but most people don't do this.
@hitoshiyamauchi 3 years ago ⁺¹
I see the research code problem. Additionally, the code is usually not maintained after publishing. So even if it was well developed, there is no guarantee it works when someone later becomes interested in it. This is not a matter of 10 years, but of just a few years nowadays. (For instance, I suppose the CUDA version, compute architecture, and GPU architecture would be the issue in your case. Matlab code probably keeps running longer.) But this is quite a hard problem, and I think it should not depend on an individual's effort. Ideally, there would be some systematic support. Thanks for focusing on a fundamental problem.
@LeiosLabs 3 years ago ⁺¹
Totally agreed! I think a lot of researchers use the fact that software is constantly evolving as an excuse *not* to review it. The way I see it, maintenance of code is a huge issue as well and hard to do right. I think the best we can do is provide the version numbers and such at publication. This will allow the code to be run for at least a decade or so.
At this stage, just reviewing code is already a big step forward.
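A minimal sketch of what "providing the version numbers at publication" could look like, assuming the research code is in Python. The function name `snapshot_environment` and the package list are hypothetical, chosen only for illustration; the idea is simply to dump the interpreter, platform, and package versions to a file that ships alongside the paper.

```python
import json
import platform
from importlib import metadata


def snapshot_environment(packages):
    """Return a dict describing the interpreter, OS, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list: replace with whatever the paper's code imports.
    print(json.dumps(snapshot_environment(["numpy", "scipy"]), indent=2))
```

Saving that JSON output next to the code at submission time records what the results were produced with, even if the code itself is never maintained afterwards.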
@AngryArmadillo 3 years ago ⁺³
Matlab is the bane of my existence
@LeiosLabs 3 years ago ⁺⁷
Haha, I also wanted to make a language review for matlab where I just show every single flaw I can find. It's genuinely infuriating!
@shoam2103 3 years ago ⁺²
@LeiosLabs please do this! Even in a lighthearted way! I like array programming, but matlab's syntax and API are off-putting.
@iminni3459 3 years ago ⁺¹
@LeiosLabs I know nothing about Matlab but I want to watch this.
@LeiosLabs 3 years ago ⁺²
I guess that's 2 people interested, at least! I'll think about how I might do it!
@altaroffire56 3 years ago ⁺¹
@LeiosLabs Make that 3. I learned Matlab in university and actually sort of like it, but I'm genuinely interested in what you have to say about its flaws.
@scottdriggers8400 3 years ago ⁺²
I once had a summer project improving someone's simulation software, and it was dreadful wading through the pages of nonsense code. (Also, nice shirt change at 6:37)
@LeiosLabs 3 years ago
Yeah, there were 2 instances where I needed to record part of the video on another day because I missed key information on the first recording.
@mossylikescake 3 years ago ⁺¹
I feel like this is a personal attack....
@LeiosLabs 3 years ago
It definitely wasn't! I've had this discussion at least a dozen times with other folks in a similar position as I am, and we all kinda agreed on these points.
I hope all is well!
@diegofloor 3 years ago
Yikes. Physicist here. I can confirm everything you say here. I've used code written in Fortran 20 years ago in my research, and it's constantly bugging out with every Fortran compiler update. Why? Because it was no one's priority to rewrite it in a more convenient language. When the project fell into my hands, I had some time to get results and then I moved on. Someone is going to inherit this code and probably do the same thing. Then there is my analysis code in Mathematica. It contains my own implementation of a few different algorithms described in papers, but there is no way to check if my implementation really is bulletproof. Collaborators don't really care as long as it produces publishable results.
What gives me some peace of mind is that, in an active field of research, a result wildly different from what is expected will come under more scrutiny.
@MarcelloZucchi91 3 years ago ⁺¹
Fortran is a perfectly convenient language for scientific computing. You could even argue that it is the best for certain applications. Chances are that your code was written badly from the get-go (maybe using non-standard, compiler-specific extensions). Fortran is and always will be backward compatible with previous versions of the standard, but compilers issue warnings if you use obsolete constructs.
@yogibairagi6354 3 years ago
How can one become a research software engineer? Is it possible without a PhD?
@Ddddddddddd381 3 years ago
In my learning, I found matlab nice for getting simple problems solved and making simple scripts, but I wouldn't want to do large projects in it.
@LeiosLabs 3 years ago ⁺¹
Yeah, I tried to make sure people knew I was biased about that part. I have not found a single use-case in my own research where using matlab would be better than python/julia, and most of the time, it's flat-out infuriating, losing me hours of time.
I appreciate that others have a different perspective and find the language useful, but I do not think I will ever be able to get over my biases about it.
@Ddddddddddd381 3 years ago
@LeiosLabs I completely agree, matlab has its quirks and aggravates me just as any other language would. Personally, I was taught in my engineering degree with a math backbone, so the syntax of matlab for math makes some sense. That being said, counting from 1 is stupid.
@LeiosLabs 3 years ago ⁺²
@Ddddddddddd381 Counting from 1 is the least aggravating thing, honestly. I also don't mind the formula specification because I was trained as a physicist. I genuinely hate how radically different its syntax is from everything else, but I could see why some people like that. It's everything else that bugs me, like how:
- you can only have 1 function per file
- loops are slow
- structs / classes are poorly optimized
- good luck with graph / tree methods
- you cannot edit files outside of the IDE provided, otherwise matlab doesn't recognize the change
- it's licensed and checks for that license on startup... Which is doubly annoying when running matlab on distributed nodes, because it then checks licenses n times, where n = number of nodes. The license here, combined with radically different syntax actually makes the software predatory because it locks users into a system that they *have* to pay for and cannot escape from easily.
- it crashes almost every time I try to run anything reasonably complex.
I mean, if it's an esoteric language, it's an esoteric language. It might have some form of historical precedent as well, but it still boggles my mind how people put up with it in 2020 when other, better languages exist for prototyping.
@Ddddddddddd381 3 years ago
@LeiosLabs What a fantastic explanation. Thank you so much! I haven't personally come across those issues because I haven't done anything really complex with it, but I really do respect your expert opinion.
@wailrimouche1171 3 years ago
May I ask how old you are?
@LeiosLabs 3 years ago ⁺²
28 right now. Started doing research at 21, so I am still relatively new to the scene. I was doing software development well before that, though!
In addition, the thoughts presented here came about from long discussions with my peers, so it's not like the arguments made were only from my own perspective!
@wailrimouche1171 3 years ago ⁺²
@LeiosLabs I was just asking because I was very impressed by your qualifications. Those are some great endeavors for someone as young as you.
@DucBanal 3 years ago ⁺¹
As a junior research instrumentation engineer, I feel the same hatred towards LabVIEW...
@LeiosLabs 3 years ago
Oh boy, I bet! That's another predatory licensed software that is entrenched into the research ecosystem.
@DucBanal 3 years ago ⁺¹
@LeiosLabs The problem is that you can't really defend any other solution, because that's all researchers know, and telling them to use anything else is seen as trying to "break something that works".
@LeiosLabs 3 years ago
@DucBanal Also agreed with this. As much as I hate the system, I don't think it's good to pull the rug out from under people.
@random_x_ 3 years ago ⁺²
In my opinion, the fact that research code is so bad (for computer science research) is inexcusable. Many times a CS undergrad could write code that actually works, with better documentation, that can be run by anyone on any machine and reproduced. It's just laziness, partially incompetence, and I don't think the code should be written at all if it is going to be bad. It's just that you have people who aren't trained as software engineers (yet are in computer science!) writing code and it's complete spaghetti. Do researchers know what they're doing? Yes, when it comes to writing papers. Should researchers be writing code? In my opinion, no not really. It's a waste of time if you're terrible enough at software engineering because it won't be reproducible anyways. If researchers HAVE to write bad code though, I would really hope that those who do just put in the readme something along the lines of "we just did this so we wouldn't get yelled at by reviewer #2 for not having an implementation. Our software is terrible and probably doesn't even work, and we barely remember how to get it to work because we're writing another paper already. You're better off writing your own implementation." That way I can stop wasting my time with the code, because 99% of the time it's easier to implement something from scratch with good software engineering practices than it is to try to get something built with terrible software engineering practices to work.
@LeiosLabs 3 years ago
I can only echo your frustration. This is exactly the problem with research code today.
@woosix7735 2 years ago
matlab lol
@td4739 1 year ago
Maybe academia should invest in Software Engineering Research.
@luismendo 3 years ago
Even if you have a "deep and fiery hatred of Matlab" (3:42) you can't leave it out of a list titled "Types of research software" (1:07). Come on
@LeiosLabs 3 years ago
It was there, under "languages and frameworks." I just gave julia as my example instead of numpy or matlab. Its underlying libraries were also there under the same section, with blas and lapack. Again, there is way too much research software out there, so it was not possible to list it all on that slide.
@platypusbox6479 6 months ago
Academia doesn't incentivize people to make good software - true that. I'd argue it doesn't incentivize them to do good research either.
