Coreutils for Windows (github.com/microsoft)
102 points by gigel82 1 hour ago | 88 comments


> Several commands share names with built-ins in CMD and PowerShell. Whether the Coreutils version runs depends on the shell, the PATH order, and (for PowerShell) the alias table.

Well, this is not very satisfying. What about providing a way where it actually works, without us having to guess where the root cause of the failure happens to be?


Fully-qualify the path to the target program, and it should be no concern.

A big part of the point is you can use workflows made for other platforms on windows natively, which you lose when you have to adjust to passing absolute paths
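
To make the PATH-order part of this concrete, here is a minimal sketch of first-match-wins lookup. It is a simplified model, not how CMD or PowerShell actually resolve commands: shell built-ins and PowerShell aliases are checked before PATH and are not modeled. The `resolve` helper and the two directory names are hypothetical stand-ins.

```python
import tempfile
from pathlib import Path


def resolve(name, path_dirs, exts=(".exe", ".cmd", ".bat")):
    """Return the first file matching name+ext, searching path_dirs in order."""
    for directory in path_dirs:
        for ext in exts:
            candidate = Path(directory) / (name + ext)
            if candidate.is_file():
                return candidate
    return None


# Two throwaway directories that both provide a "sort.exe".
root = Path(tempfile.mkdtemp())
system32 = root / "System32"    # hypothetical stand-in for the Windows directory
coreutils = root / "coreutils"  # hypothetical stand-in for the Coreutils install
for d in (system32, coreutils):
    d.mkdir()
    (d / "sort.exe").write_bytes(b"")

# Same name, different winner, depending only on directory order.
first = resolve("sort", [system32, coreutils])
second = resolve("sort", [coreutils, system32])
print(first.parent.name)   # System32
print(second.parent.name)  # coreutils
```

A fully qualified path skips this search entirely, which is why it is robust, and also why it breaks scripts written for other platforms.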

Busybox helps you avoid this nicely on Windows. When you run one of its shells, it uses all its own builtins in preference to anything external.

Get the 64-bit version.

https://frippery.org/busybox/index.html


The best part is that the reason it conflicts with a lot of PowerShell is that PowerShell shimmed Linux commands over to their Windows equivalents for years, even though the flags were different.

So ls in many systems will match the behavior of dir, and only accept the flags for dir. But if you use a system with the newer coreutils release here, ls will expect ls flags!


I wonder if the motivation is to make AI agents work better on Windows?

For sure. I wonder how long until the agents learn about this though. At least a year, right?

Or you can tell your agent about it in one line of AGENTS.md.

nope. With ClaudeCode you can create skills (basically markdown instructions) to teach your agents what command to use. You can also update CLAUDE.md to inject custom instructions that are fed in any time ClaudeCode is started.

If that was sufficient wouldn't it just be easier to map to the Powershell commands directly?

So dir is not shipped due to conflict with built-ins, echo and rmdir are shipped despite conflicts, and sort is deemed not to have a conflict? What is the logic?

No idea, this is broken at start, I would expect at least a reasoning on how they expect to improve the mess going forward.

Otherwise just don't do it, if it is going to be a mess to work with.


There's almost no point to this, especially since they're already shipping a (strictly) limited subset with the reasoning "not useful on Windows" despite Windows equivalent facilities _clearly_ existing. They should have at least considered a full native port.

This smells like someone's promotion project to get stuff shown at BUILD, like the old sudo as a runas replacement, which I don't care exists.

"Yo make some UNIX stuff to show at BUILD as developer tools".


AI said to do it.

I think if it conflicts with a CMD command it's not shipped, but if it conflicts with a powershell command it's ok.

Windows really needs to ditch CRLF and just use LF, and switch from backslashes to forward slashes. Or better yet, just switch everything to full POSIX.

In powershell everything is much better than cmd, but it's just not enough.

WSL is generally great, but there are annoying downsides. I often get "catastrophic" crashes and the zone identifier files drive me nuts. Plus it takes so much longer to start VSCode when connecting with WSL, and now you've got two file systems. WSL1 was in many ways better than WSL2 for these reasons.


Most (everything?) on Windows actually works with forward slashes. However, much of the tooling will overwrite your version with a backslash wherever it can.
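
A small illustration of both halves of that, using Python's `pathlib` as one example of such tooling (the path itself is made up): both spellings parse to the same path, and the default string form comes back with backslashes.

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath("C:/Users/me/notes.txt")
backward = PureWindowsPath(r"C:\Users\me\notes.txt")

print(forward == backward)  # True
print(str(forward))         # C:\Users\me\notes.txt
print(forward.as_posix())   # C:/Users/me/notes.txt
```

`PureWindowsPath` only manipulates strings, so this runs on any OS and says nothing about what a particular Windows API or application accepts.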

Windows is also a rare bird in UTF-16.

"UTF-16 is used by the Windows API, and by many programming environments such as Java and Qt. The variable-length character of UTF-16, combined with the fact that most characters are not variable-length (so variable length is rarely tested), has led to many bugs in software, including in Windows itself.

UTF-16 is the only encoding (still) allowed on the web that is incompatible with 8-bit ASCII. It has never gained popularity on the web, where it is declared by under 0.004% of public web pages (and even then, the web pages are most likely also using UTF-8). UTF-8, by comparison, gained dominance years ago and accounted for 99% of all web pages by 2025."

https://en.wikipedia.org/wiki/UTF-16


UTF-16 is also used by C#, Java, and JavaScript. Since JavaScript is so widely adopted, I wouldn't call it a rare bird. Not necessarily used when reading or writing files, but it's what's used internally for the strings. As a result, your strings use UTF-16 surrogate pairs to represent characters outside of the basic multilingual plane (such as Emoji).
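
For anyone who hasn't seen a surrogate pair, a quick sketch in Python (which does not use UTF-16 for its own strings, so the code units are made visible by encoding explicitly). `utf16_code_units` is just an illustrative helper name.

```python
def utf16_code_units(text):
    """Return the 16-bit code units of text when encoded as UTF-16."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


grinning_face = "\U0001F600"  # an emoji outside the basic multilingual plane

print([hex(u) for u in utf16_code_units("A")])            # ['0x41']
print([hex(u) for u in utf16_code_units(grinning_face)])  # ['0xd83d', '0xde00']
print(len(grinning_face))                                 # 1
```

One code point, two code units: that mismatch is where the "variable length is rarely tested" bugs tend to come from.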

No worries, eventually Xenix will make a comeback.

You cannot ditch CRLF, Microsoft isn't Apple.

Windows accepts backslashes and forward slashes, only old applications that manually search for one of them get it wrong.


On Windows, I've mostly avoided CRLF by configuring text editors and git to use LF, and writing text files in binary mode.

The only places that still forced CRLF were batch files and clipboard.
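
A minimal sketch of what that looks like in Python, as one example; other languages have their own switches. Either disable newline translation in text mode or write bytes directly. The file name is arbitrary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# Option 1: text mode with newline="\n", so "\n" is not translated to os.linesep.
with open(path, "w", newline="\n", encoding="utf-8") as f:
    f.write("first\nsecond\n")

with open(path, "rb") as f:
    raw_text_mode = f.read()

# Option 2: binary mode, where no translation ever happens.
with open(path, "wb") as f:
    f.write("first\nsecond\n".encode("utf-8"))

with open(path, "rb") as f:
    raw_binary_mode = f.read()

print(raw_text_mode)    # b'first\nsecond\n'
print(raw_binary_mode)  # b'first\nsecond\n'
```

With a plain `open(path, "w")` on Windows, the same write would produce `\r\n` on disk.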


> I've mostly avoided CRLF by configuring text editors and git to use LF,

That has been my experience as well. I can't remember the last time I had an issue related to CRLF.


Don't think this will ever happen, especially since this is Microsoft you are talking about [0]

[0] https://www.youtube.com/watch?v=bC6tngl0PTI


> Or better yet, just switch everything to full POSIX.

Really not possible as most of POSIX semantics arise naturally from the kernel (or are enforced/executed at the kernel level). Windows technically provides some of them (or semantic equivalents) so you could make something work, but in order to do a full port you'd need to strip out too many concepts for it to be worthwhile. For instance the idea that "everything is a file" or the single root filesystem layout (which iirc is segmented deeply at the kernel level).


They might as well create a new OS with a different name, because none of the existing applications will work, and no enterprise customer will use it.

> ... Or better yet, just switch everything to full POSIX.

Interix[0] did a pretty good job of this, but MSFT killed it. I was compiling GNU tools w/ GCC and running bash under Interix back in 2000 under Windows 2000. It was grand.

[0] https://en.wikipedia.org/wiki/Interix


No, they need to ditch drive letters first. The NT kernel and NTFS don't even require them (I used to mount disks without drive letters back in the NT 4 era). They just don't care enough to get rid of this annoyance.

Nobody wants to use \??\GLOBALROOT\Device\HarddiskVolume3\ in their paths.

Users, especially non-technical ones, find them highly useful in my experience. Is it a net positive to get rid of them, or will it largely only make developers happier?

the two filesystems can be a super power... i seamlessly use the same driver between wsl2 and my dual booted opensuse.

Yeah, I don't mind/like the two file systems. Looks like MS is taking it further too; they also announced WSL Containers & API.

honestly your point is a bit weird.

powershell is good. it's much better than unix's "everything piped is text" idea. godawful, that. outputs being objects is a really solid take.

WSL is trash.

besides that, lf vs. crlf is silly as you mention but crlf is more logical considering what its implementing. that being said the notion of these control chars is already based on outdated and limited ideas.

if you want a consistent system to do things with dont pick a system which tries to be two systems.

Linux has wine. Windows has WSL.

I'd recommend BSD. any flavor will do.

might take some adjustments but you will have a more 'rational' system if that is what you desire.

(otherwise, embrace the madness!)


> outputs being objects is a really solid take.

Glad I'm not alone here ha!

Being able to go someoutput | Format-Table | Select ColumnName,ColumnName,ColumnName is great. Beats memorizing the output format of any specific command and trying to wrangle it with awk.
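
The same contrast, sketched in Python rather than PowerShell. The process records and field names are invented for illustration, and `select` is a hypothetical helper, only roughly like `Select-Object -Property`.

```python
# Hypothetical process records; the field names are made up for illustration.
processes = [
    {"Name": "svchost", "Id": 1234, "MemoryMB": 41},
    {"Name": "explorer", "Id": 5678, "MemoryMB": 152},
]


def select(records, *columns):
    """Keep only the named columns of each record."""
    return [{c: r[c] for c in columns} for r in records]


# Object style: ask for fields by name.
by_name = select(processes, "Name", "MemoryMB")
print(by_name)

# Text style: the caller has to know which column position holds which field.
text = "svchost 1234 41\nexplorer 5678 152"
by_position = [(line.split()[0], int(line.split()[2])) for line in text.splitlines()]
print(by_position)
```

The text version breaks as soon as a name contains a space or a column is added, which is the awk-wrangling problem in miniature.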


>Windows really needs to ditch CRLF...

Windows needs to ditch itself.


You can install GNU-compatible shell commands when installing Git for Windows. It even includes useful Unix utilities like bash, so check it out if you're interested.

More project information: https://gitforwindows.org/

Official download: https://git-scm.com/install/windows


Or MSYS2, to go even further. https://www.msys2.org/

I would have liked to see head, tail, tr, uniq, and cut. I end up dragging over the old "gnuwin32" versions of those to a lot of Windows machines. Those are my go-to tools for quick-and-dirty log analysis.

I know I could use Powershell for those kinds of tasks, and I certainly do make a lot of use of Powershell, but the familiarity of those simple tools and the decades-old "muscle memory" of using them on various Unix, Linux, and Windows boxes makes them hard to ditch.
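
For machines where none of those tools are around but Python is, rough analogues fit in a few lines. This is only a sketch: the function names mirror the Unix tools, but flags and edge cases (missing delimiters, byte vs character handling) are not reproduced, and the sample log lines are made up.

```python
from collections import deque
from itertools import groupby, islice


def head(lines, n=10):
    """First n lines."""
    return list(islice(lines, n))


def tail(lines, n=10):
    """Last n lines, keeping only n lines in memory at a time."""
    return list(deque(lines, maxlen=n))


def uniq(lines):
    """Collapse runs of adjacent duplicate lines."""
    return [key for key, _ in groupby(lines)]


def cut(lines, delimiter, field):
    """Return the 1-based field of each line, or '' if the line is too short."""
    result = []
    for line in lines:
        parts = line.split(delimiter)
        result.append(parts[field - 1] if len(parts) >= field else "")
    return result


log = ["GET /a 200", "GET /a 200", "GET /b 404", "GET /a 200"]
print(head(log, 2))      # ['GET /a 200', 'GET /a 200']
print(tail(log, 2))      # ['GET /b 404', 'GET /a 200']
print(uniq(log))         # ['GET /a 200', 'GET /b 404', 'GET /a 200']
print(cut(log, " ", 3))  # ['200', '200', '404', '200']
```

`head` and `tail` accept any iterable, so they also work on an open file object without reading the whole log first.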


Windows has lacked decent ports of recent GNU tools for a while. I still use some very old ones. It would be great if MS worked on the other tool groups like textutils.

I also use those commands to filter output and feed AI agents with it. tail and head are my favourites for avoiding wasted tokens. Way too many fancy build log messages.

I feel like I'm seeing an error, or I just don't understand what they mean w/ "find" and "Integrated port of the original DOS command" and not listed as conflicting.

There's a "%SystemRoot%\System32\find.exe" on every Windows NT-derived OS. That's absolutely a conflict.

Also, the "find" command from "findutils" is in no way functionally similar to the "original DOS command" (which is for finding text in files).

Aside: Eschew "find.exe" on Windows for "findstr.exe". The latter is vastly more efficient. I discovered that by happenstance once and have trained my hands to type "findstr" when I mean "find" on Windows.


What does this do that Cygwin doesn't?

Cool. I'm already using cygwin for a lot of these utilities. Would the Microsoft versions have any advantages?

No .DLLs would be the primary reason.

In the intentionally dropped section, it lists shred as "Not particularly useful on Windows." Does anyone know why? Is there already a shred-like command in Windows?

From shred man:

The shred command relies on a crucial assumption: that the file system and hardware overwrite data in place.

...

many modern file system designs do not satisfy this assumption. Exceptions include:

...

Log-structured or journaled file systems, such as.

...

NTFS.


I think SSDs also randomize where data ends up? But I'm not sure if that's true for existing files too.

Yes. All of the assumptions made with shred and sdelete apply only for spinning HDDs. SSDs require different methods of wiping.

Is there no way to track down where the data actually lives?

It depends on the firmware running on the SSD, so theoretically it’s possible but practically it’s not. Instead, SSDs use a special command to zero all cells on the chip at once, so it’s all or nothing. You can’t target specific files.

To clarify: the host can issue a command to the SSD to securely wipe the whole drive including spare area that is not directly accessible to the host. The SSD controller in the drive issues erase commands to the NAND to erase individual erase blocks, with typical sizes on the order of 16MB.

The SSD controller does not usually keep a history of where older versions of a block of data were stored, so it's not practical to erase an individual file and catch any partial older versions that may not yet have been garbage collected.


The filesystem may choose to store new data at different logical block addresses than older versions. The SSD will definitely choose to store those newly written blocks at different physical addresses, both for the sake of wear leveling and for performance, because a read-erase-rewrite cycle on an entire NAND flash erase block (several MB at minimum) is a very slow operation.
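
A toy model of that remapping, to show why overwriting a logical block does not destroy the old contents. `ToyFlash` is entirely hypothetical and deliberately tiny: real controllers work in pages and erase blocks, and eventually garbage-collect stale pages, none of which is modeled here.

```python
class ToyFlash:
    """Toy flash translation layer: every write lands on a fresh physical page."""

    def __init__(self):
        self.pages = []    # physical pages, append-only in this model
        self.mapping = {}  # logical block address -> index into self.pages

    def write(self, lba, data):
        self.pages.append(data)  # never overwrite in place
        self.mapping[lba] = len(self.pages) - 1

    def read(self, lba):
        return self.pages[self.mapping[lba]]


ssd = ToyFlash()
ssd.write(0, b"secret")
ssd.write(0, b"\x00" * 6)  # what an overwrite-style wipe would do

print(ssd.read(0))             # b'\x00\x00\x00\x00\x00\x00'
print(b"secret" in ssd.pages)  # True: the stale page still holds the old data
```

From the host's side the block reads back as zeros, but the old page is still physically present until the controller decides to erase it.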

I assume it requires something exposed by the underlying filesystem.

No. Shred will "work" - as in, compile, run, and have the expected logical effects of ultimately removing the file from the directory index - on any filesystem backed by any block device. The problem is that overwriting any part of a file is not guaranteed to actually erase the overwritten data. Actually, it never has been; shred is kind of a hack that assumes an overwriting file system driver and a block device dumb enough to not remap sectors writing to media that's intrinsically erasable. e.g. try running shred on a mounted CD-R and see how far that gets you.

https://frippery.org/busybox/

winget install -e --id frippery.busybox-w32


There's also a windows port of busybox if you want something more stable. w64devkit uses it.

https://github.com/rmyorston/busybox-w32


Not similar at all - too heavy weight when you just want to use a small tool.

This one might interest you, although it's quite old.

https://unxutils.sourceforge.net/

Busybox's shell is ash, but the above set includes an old zsh IIRC.

Note also that the Frippery Windows busybox is available as a 64-bit version, in case 2gb is not enough (easy with some big awk associative arrays).


Microsoft provides an awesome problem-solving solution, and 0 of the 6 problems they listed in the 'Windows caveats' section are solved.

FINALLY. This is actually exciting to me... Mind you the linux ports (cygwin, msys2, git bash) are all great to have and I make sure one version or the other is always on my path but having MS maintain them (assuming they continue to do so) is great news

No thanks, I'd rather use linux.

Is this only on windows 11 or does it support 10 as well? (i cant access the site rn because of wifi)

Windows 10 is end-of-life; so the question itself is odd. It may work on Windows 10, but they definitely don't support an EoL version of their OS.

they should give up on the backwards slash.

They should give up.

Nice. I appreciate the effort to make things less painful for powerusers. I had noticed some of these working already in PS.

If anyone from MS is reading this, can we please also get an equivalent (or even an alias) for the thing that shows the IP address? The Windows equivalent of "ip a" is some convoluted PS command that I can never remember.


in PowerShell there is a built-in alias.

> gip

You could also make your own alias if you specifically want to type "ip a": just add a PowerShell function to your $PROFILE, e.g. function ip { param($argument) ... }, and have it call Get-NetIPAddress, else fall back to ipconfig.


ipconfig works pretty well.

Isn't this just a restricted uutils fork? With most functionality culled for no good reason? "uname isn't useful on Windows" how? OSName/ Build numbers / systeminfo all exist?

That's actually a good idea. Now, I am a Linux person, but I have windows on a secondary machine. Compiling on Linux is trivial.

On Windows it is possible of course, WSL, msys, what not, but it is cumbersome. And I hate the default compiler on windows. So if coreutils on windows helps simplify all the base toolchain, I am all in favour of it. Windows really needs to make compiling stuff a LOT easier by default. I don't want to download some x GB of stuff I don't really need.


Is 'dir' a Linux command?

Would it make sense to add a prefix to all commands to avoid conflicts with built-in commands? Like how, on macOS and FreeBSD, installing GNU Coreutils adds a `g` prefix, Microsoft could add an `m` prefix to these commands.

Native Coreutils for Windows is genuinely some good news coming from Microsoft.

The command line team has been doing some solid work for a while. I recognize lhecker from the also great wt & edit projects.

If you told me during the Windows 7 era the Windows CLI would not only be getting nice but getting pretty comfortable I would never have believed it.


Yeah, the problem with Windows isn't the command line team, the problem is the marketing & sales div using windows to push every other MS service.

If they just kicked them out and left the Windows div alone it'd be a decent OS. All the bones are there.


Hopefully these do not require a PhD to be implemented.

With these AI agents using a lot of Linux commands, I think they have to do it.

Busybox for Windows is the best implementation of coreutils for it, far and away. The maintainer is also very knowledgeable and responsive and actually merges community PRs which is incredible. Microsoft isn't going to do that, so why bother? Microsoft's solution will be a hot buggy mess that needs its own workaround and quirks day 1.

I just have msys on my PATH.

A fair question is why this fork of coreutils is required when the original Rust rewrite (https://github.com/uutils/coreutils/) supports Windows, in addition to Linux, macOS and wasm.

The reason seems to be a few windows specific fixes (https://github.com/uutils/coreutils/compare/main...microsoft...) which can probably be upstreamed into the main repo.


So “supports Windows” doesn’t mean supports Windows.

Apparently the creator of the fork is also a maintainer of some uutils repositories.

uutils coreutils was/is already available and more complete than this

citation needed

Exactly. The best Linux distro is Windows.

The case for that statement (if there's a case at all) is wsl. This is outside wsl.

Was not expecting EEE for Coreutils but I suppose it’s the natural consequence of the MIT license used for uutils so not totally unexpected.

It’s annoying enough to support the differences between BSD and Linux, and now Linux has GNU and uutils, and now we’re gonna need Windows variant of uutils…ugh.


It was really obvious.

Microsoft "loves" Linux for years and the entire point was to bring the Linux userspace on the Windows Desktop.